Description

INSECT NUCLEAR RECEPTOR GENES AND USES THEREOF 

Field of the Invention 
The present invention generally relates to nuclear receptor genes.
More particularly, the present invention provides novel nuclear receptor
nucleic acid and polypeptide sequences, chimeric genes comprising the
disclosed nuclear receptor sequences, antibodies that specifically recognize
the disclosed nuclear receptor polypeptides, modulators of nuclear receptor
nucleic acids and polypeptides, and uses thereof.

Table of Abbreviations
ATCC            American Type Culture Collection
βFTZ-F1         fushi tarazu transcription factor 1, β isoform
CDS             coding sequence
DBD             DNA-binding domain
DERR            Drosophila estrogen-related receptor
DFAX1           Drosophila fax-related gene 1
DFAX2           Drosophila fax-related gene 2
DHR38           Drosophila hormone receptor 38
DHR39           Drosophila hormone receptor 39
DHR4            Drosophila hormone receptor 4
DSF             dissatisfaction
dsRNA           double-stranded RNA
dsRNAi          double-stranded RNA interference
E75A            ecdysone-inducible gene E75, A isoform
EcR             ecdysone receptor
EGON            eagle nuclear receptor
DHR3            Drosophila hormone receptor 3
DHR78           Drosophila hormone receptor 78
DHR96           Drosophila hormone receptor 96
E78             Drosophila ecdysone-inducible protein 78
EGON            eagle
FCS             Fluorescence Correlation Spectroscopy
GST             glutathione S transferase
HMM             Hidden Markov Model
DHNF4           Drosophila hepatic nuclear factor 4
HR3             hormone receptor 3
hs              Drosophila hsp70 promoter
IPM             integrated pest management
JH              juvenile hormone
LBD             ligand-binding domain
PCR             polymerase chain reaction
PEG             polyethylene glycol
PTTH            prothoracicotropic hormone
RACE            rapid amplification of cDNA ends
SELDI-TOF MS    Surface-Enhanced Laser Desorption/Ionization
                Time-of-Flight Mass Spectroscopy
Sf9 cells       Spodoptera frugiperda cells
SPR             Surface Plasmon Resonance
USP             ultraspiracle

Background Art 

Insects contribute to or cause many human and animal diseases, and
are responsible for substantial agricultural and property damage. The
societal costs associated with insect pests in dollars, time, and suffering are
monumental. To combat these problems, insecticidal compounds have been
developed and employed. The total worldwide market size for insecticide
crop protection is over $5 billion, and insecticide products comprise
approximately 32% of world consumption of pesticides.

Insecticide development has been guided predominantly by lead-finding
efforts for new chemical structures. According to this strategy, chemical
derivatization of a known insecticide is performed, and the synthesized
compounds are analyzed for insecticidal activity. An alternative approach
relies on methods for detecting molecular interactions between a candidate
compound and a target molecule. An ideal target molecule is precisely
regulated during insect development, such that modulation of the activity or
level of activity of the target molecule results in organismal lethality. High
throughput screening methods have enabled rapid screening of diverse and
populous compound libraries for an ability to interact with a target molecule.
The novel modulators discovered by such methods are useful as
insecticides.

A target molecule can be further selected based on modulation of the
target molecule activity that results in lethality during larval development.
The insect life cycle requires successive larval or nymph stages that are
devoted to growth such that the animal can increase mass by several
thousand-fold. To sustain this growth, immature insects feed unabated for
prolonged periods, and thus are particularly deleterious to agricultural crops
during this developmental stage.

Each larval instar concludes with molting of the larval cuticle to
accommodate the changing size of the larvae. The apolysis of the old
cuticle and the synthesis of new cuticle are regulated by the coordinate
action of juvenile hormone and ecdysone (20-hydroxyecdysone, hereafter
referred to as ecdysone). The neuropeptide PTTH directs a transient rise in
hemolymph titer of ecdysone, which is the trigger for molting. A concomitant
high level of juvenile hormone signals resynthesis of larval cuticle, while low
juvenile hormone levels signal the synthesis of pupal cuticle and
commitment to metamorphosis. See Nijhout (1994) Insect Hormones,
Princeton University Press, Princeton, New Jersey.

Classical endocrinology studies demonstrated that a premature rise in
hemolymph ecdysone achieved by feeding larvae ecdysone was sufficient to
trigger premature molting. More recently, experiments in Drosophila have
provided genetic evidence for ecdysone control of larval molting. Mutation of
the ultraspiracle (usp) gene, which encodes the heterodimeric partner of EcR
to form the functional ecdysone receptor, results in larval lethality with
supernumerary spiracles indicative of an incomplete molt (Perrimon et al.
(1985) Genetics 111:23-41; Oro et al. (1992) Development 115(2):449-462).
The ecdysoneless gene encodes a protein required for ecdysone synthesis,
and animals carrying mutations in ecdysoneless show defective larval
molting and death (Henrich et al. (1987) Dev Biol 120(1):50-55; Henrich et
al. (1993) Dev Genet 14(5):469-477). Similarly, mutations of the dare and
dre4 genes, which encode proteins required for ecdysone synthesis, result in
defective molting (Freeman et al. (1999) Development 126:4591-4602; Sliter
& Gilbert (1992) Genetics 130:555-568). These studies demonstrate that
misregulated ecdysone signaling, whether premature or absent, disrupts
larval molting and leads to larval death.

Molecular cloning of the ecdysone receptor has provided a foundation
for understanding the mechanism of ecdysone signaling and the mode of
action for growth regulatory insecticides. The functional ecdysone receptor
is a heterodimer of the ecdysone receptor (EcR) and ultraspiracle (USP)
nuclear receptor proteins (Koelle et al. (1991) Cell 67:59-77; Koelle (1992)
Ph.D. Thesis, Stanford University, Stanford, California; Yao (1993) Cell
71:63-72; Thomas et al. (1993) Nature 362:471-475). Ecdysone enters cells
by virtue of its hydrophobic structure, and binds the EcR/USP heterodimer.
The hormone-bound or activated receptor binds DNA at regulatory
sequences called response elements and therein directs gene transcription.
A subset of the immediate gene targets of the ecdysone signal, the primary
response genes, are transcription factors that turn on a large group of
secondary response genes that ultimately lead to altered cell functions.
Thus, the hormone signal elicits a transcription cascade that both amplifies
and diversifies the initial signal.

Several insecticides are known to elicit insect lethality by interfering
with nuclear receptor signaling. The non-steroidal ecdysone agonists
RH5849 (Wing (1988) Science 241:467-469), RH2485 (methoxyfenozide,
Dhadialla et al. (1998) Annu Rev Entom 43:545-569), and RH5992
(tebufenozide, also known as the insecticide MIMIC®), are chemical ligands
for the ecdysone receptor (EcR). See also, Dhadialla et al. (1998) Annu Rev
Entom 43:545-569, incorporated herein by reference, which describes
several insecticides with ecdysteroidal and juvenile hormone activity.

At least seven other nuclear receptor genes are regulated by
ecdysone, implicating their participation in larval growth and molting as well.
Thummel (1997) BioEssays 19(8):669-672. At a molecular level, these
nuclear receptors might function in yet undiscovered hormone signaling
pathways. Alternatively or in addition, they might modulate ecdysone
signaling by heterodimerization and transregulation. See White et al. (1997)
Science 276(5309):114-117; Sutherland et al. (1995) Proc Natl Acad Sci USA
92(17):7966-7970.

Functional analysis of orphan nuclear receptors can be addressed
using standard molecular and genetic techniques in Drosophila
melanogaster (hereinafter "Drosophila"). In addition to EcR and USP,
fourteen nuclear receptor genes have been identified in Drosophila: knirps,
knirps-related (knrl), egon/eagle (eg), seven-up (svp), tailless (tll), hepatic
nuclear factor 4 (DHNF4), fushi tarazu factor 1 (FTZ-F1), ecdysone-inducible
protein 75 (E75), ecdysone-inducible protein 78 (E78), hormone receptor 3
(DHR3), hormone receptor 38 (DHR38), hormone receptor 39 (DHR39),
hormone receptor 78 (DHR78), and hormone receptor 96 (DHR96)
(reviewed in Thummel (1995) Cell 83:1-20). All such receptors, with the
exception of EcR and USP, are designated orphan nuclear receptors to
signify that corresponding endogenous ligands have not been identified.
The insect proteins KNIRPS, KNIRPS-RELATED, and EGON lack a
discernible ligand binding domain, but are classified as nuclear receptors
based on the distinctive sequence of their DNA binding domains (Nauber et
al. (1988) Nature 336(6198):489-492; Oro et al. (1988) Nature
336(6198):493-496; Higashijima et al. (1996) Development 122(2):527-536).

There exists a continuing demand for insecticides that show improved
efficacy and new modes of action. To this end, the present invention
discloses a functional characterization of nuclear receptors during
Drosophila larval development. Nuclear receptors that confer larval lethality
when misregulated are used in biochemical assays as targets for insecticide
development. The present invention also discloses several novel insect
nuclear receptor polypeptides and nucleic acid molecules encoding the
same that are further useful as components of gene switch technology for
inducible gene expression.

Summary of the Invention 
The present invention discloses isolated insect nuclear receptor
polypeptides and isolated nucleic acid molecules encoding the same.
Preferably, an isolated insect nuclear receptor polypeptide, or functional
portion thereof, comprises a polypeptide encoded by the nucleic acid
molecule of any one of SEQ ID NOs:1, 5, 9, 13, 17, 19, 21, 23, and 25; a
polypeptide encoded by a nucleic acid molecule that is substantially identical
to any one of SEQ ID NOs:1, 5, 9, 13, 17, 19, 21, 23, and 25; a polypeptide
having an amino acid sequence of any one of SEQ ID NOs:2, 6, 10, 14, 18,
20, 22, 24, and 26; a polypeptide that is a biological equivalent of any one of
SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26; or a polypeptide that is
immunologically cross-reactive with an antibody that shows specific binding
with a polypeptide comprising some or all amino acids of any one of SEQ ID
NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26.

The present invention further teaches chimeric genes having a
heterologous promoter that drives expression of a nucleic acid sequence
encoding an insect nuclear receptor polypeptide. Preferably, the chimeric
gene is carried in a vector and introduced into a host cell so that an insect
nuclear receptor polypeptide of the present invention is produced. Preferred
host cells include but are not limited to a bacterial cell, an insect cell, and a
plant cell.

In another aspect of the invention, a method is provided for detecting
a nucleic acid molecule that encodes an insect nuclear receptor polypeptide.
According to the method, a biological sample having nucleic acid material is
hybridized under stringent hybridization conditions to an insect nuclear
receptor nucleic acid molecule of the present invention. Such hybridization
enables a nucleic acid molecule of the biological sample and the insect
nuclear receptor nucleic acid molecule to form a detectable duplex structure.
Preferably, the insect nuclear receptor nucleic acid molecule includes some
or all nucleotides of any one of SEQ ID NOs:1, 5, 9, 13, 17, 19, 21, 23, and
25.

The present invention further teaches an antibody that specifically
recognizes an insect nuclear receptor polypeptide. Preferably, the antibody
recognizes some or all amino acids of any one of SEQ ID NOs:2, 6, 10, 14,
18, 20, 22, 24, and 26. A method for producing an insect nuclear receptor
antibody is also disclosed, and the method comprises recombinantly or
synthetically producing an insect nuclear receptor polypeptide, or portion
thereof, as set forth in any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24,
and 26; formulating the insect nuclear receptor polypeptide so that it is an
effective immunogen; immunizing an animal with the formulated polypeptide
to generate an immune response that includes production of insect nuclear
receptor antibodies; and collecting blood serum from the immunized animal
containing antibodies that specifically recognize an insect nuclear receptor
polypeptide. Antibody-producing cells can be optionally fused with an
immortal cell line whereby a monoclonal antibody that specifically recognizes
an insect nuclear receptor polypeptide can be selected.

A method is also provided for detecting a level of insect nuclear
receptor polypeptide using an antibody that recognizes an insect nuclear
receptor polypeptide of any of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and
26. According to the method, a biological sample is obtained from an
experimental subject and a control subject, and an insect nuclear receptor
polypeptide is detected in the sample by immunochemical reaction with the
insect nuclear receptor antibody. Preferably, the antibody recognizes amino
acids of any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26; and is
prepared according to a method of the present invention for producing such
an antibody.

The present invention further discloses a method for identifying a
compound that modulates nuclear receptor function. The method
comprises: (a) exposing an isolated insect nuclear receptor polypeptide of
any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26 to one or more
compounds, and (b) assaying binding of a compound to the isolated insect
nuclear receptor polypeptide. A compound is selected that demonstrates
specific binding to the isolated insect nuclear receptor polypeptide.
Preferably, the modulator is a chemical compound, a protein, a peptide, a
nucleic acid, or an antibody, and was prepared according to a method
disclosed herein.

The present invention also provides a method for identifying an
insecticidal compound that modulates nuclear receptor function. The
method comprises: (a) isolating an insect nuclear receptor polypeptide of
any one of even-numbered SEQ ID NOs:2-34, wherein modulation of the
insect nuclear receptor polypeptide confers lethality of an insect during a
larval stage; (b) exposing the isolated insect nuclear receptor polypeptide to
a plurality of substances; (c) assaying binding of a substance to the isolated
nuclear receptor polypeptide; and (d) selecting a substance that
demonstrates specific binding to the isolated insect nuclear receptor
polypeptide. Preferably, the modulator is a chemical compound, a protein, a
peptide, a nucleic acid, or an antibody, and was prepared according to a
method disclosed herein.

The present invention further provides a method for preventing or
treating an insect infestation of a plant, the method comprising: (a) preparing
an insecticidal composition that is a modulator of an insect nuclear receptor
set forth as any one of even-numbered SEQ ID NOs:2-34; and (b) contacting
an effective dose of the insecticidal composition with a plant, whereby an
insect infestation of the plant is prevented or abrogated. Preferably, the
insecticidal composition comprises a chemical compound, a protein, a
peptide, a nucleic acid, or an antibody, and was prepared according to a
method disclosed herein. Preferably, the insect infestation is abrogated by
lethality of the insect. In one embodiment, the insecticidal composition also
displays nematicidal activity, such that contacting an effective dose of the
insecticidal composition with a plant prevents or abrogates a nematode
infestation of the plant.

The present invention further provides a method for preventing or
abrogating an insect infestation of a plant, the method comprising: (a)
expressing in a plant an insect nuclear receptor modulator that modulates
the activity of an insect nuclear receptor polypeptide of any one of
even-numbered SEQ ID NOs:2-34, whereby an insect infestation of a plant is
prevented or abrogated. Preferably, the insecticidal composition comprises
a protein, a peptide, a nucleic acid, or an antibody. In one embodiment, the
insecticidal composition additionally displays nematicidal activity, such that
expression of the insect nuclear receptor modulator in a plant prevents or
abrogates a nematode infestation of the plant. The present invention further
embodies plants, plant tissues, plant seeds, and plant cells that express an
insect nuclear receptor modulator and that are therefore able to inhibit plant
parasitic infestation.

The present invention also discloses a chimeric nuclear receptor
cassette comprising a DNA binding domain, a ligand binding domain, a
hinge domain, and an activation or repression domain, wherein one or more
of the DNA binding domain, ligand binding domain, hinge domain, or
activation or repression domain is identical or substantially identical to a
portion of any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26.

Also disclosed is a method of inducing expression of a target nucleic
acid sequence. In a preferred embodiment, the method comprises: (a)
constructing a chimeric nuclear receptor expression cassette wherein one or
more of the DNA binding domain, ligand binding domain, and
activation/repression domains is identical or substantially identical to a
portion of any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26; (b)
constructing a target expression cassette having a target nucleotide
sequence and a cis-regulatory element that is recognized by a DNA binding
domain of the chimeric nuclear receptor; (c) expressing the chimeric nuclear
receptor expression cassette and the target expression cassette in a
heterologous organism; and (d) contacting a ligand that binds to the ligand
binding domain of the chimeric nuclear receptor with the organism, whereby
the target nucleotide sequence is expressed. In one embodiment, the method is
performed to induce gene expression in a plant. The present invention also
encompasses plants, plant tissues, plant seeds, and plant cells comprising a
disclosed nuclear receptor expression cassette and a target expression
cassette.

Accordingly, it is an object of the present invention to provide novel
insect nuclear receptor nucleic acids and polypeptides, and novel methods
relating thereto. This object is achieved in whole or in part by the present
invention.

An object of the invention having been stated above, other objects 
and advantages of the present invention will become apparent to those 
skilled in the art after a study of the following description of the invention, 
Figures, and non-limiting Examples. 


Brief Description of the Drawings 
Figure 1 is a neighbor-joining tree generated using the pileup feature
of the GCG sequence analysis program (Devereux et al. (1984) Nuc Acids
Res 12:387-395). The tree depicts relationships among the homeodomain
regions of Drosophila nuclear receptors (eagle, GenBank Accession No.
D43634; knirps-related, GenBank Accession No. X14153; knirps, GenBank
Accession No. X13331; DHR4, GenBank Accession No. AL035245 and SEQ
ID NO:14; dissatisfaction, GenBank Accession No. AF106677; tailless,
GenBank Accession No. AF019362; FTZ-F1, GenBank Accession No.
M98397; DHR39, GenBank Accession No. L07551; E78, GenBank
Accession No. U01087; E75, GenBank Accession No. X51548; DHR3,
GenBank Accession No. M90806; DHR78, GenBank Accession No. U36791;
sevenup, GenBank Accession No. M28863; USP, GenBank Accession No.
X53417; DHNF4, GenBank Accession No. U70874; DHR38, GenBank
Accession No. X89246; EcR, GenBank Accession No. M74078; DHR96,
GenBank Accession No. U36792; DERR, SEQ ID NO:2; DFAX1, SEQ ID
NO:6; DFAX2, SEQ ID NO:10).

Figures 2A-2B present an alignment among USP proteins derived
from the indicated insect species. B.mori, Bombyx mori USP (GenBank
Accession No. AAC13750; SEQ ID NO:43); M.sexta, Manduca sexta USP
(GenBank Accession No. AAB64234; SEQ ID NO:44); C.fumiferana,
Choristoneura fumiferana USP (GenBank Accession No. AAC31795; SEQ
ID NO:45); H.virescens, Heliothis virescens USP (SEQ ID NO:20);
L.migratoria, Locusta migratoria USP (GenBank Accession No. AAF00981;
SEQ ID NO:46); C.tentans, Chironomus tentans USP (GenBank Accession
No. AAC03056; SEQ ID NO:47); D.melanogaster, Drosophila melanogaster
USP (GenBank Accession No. X53417; SEQ ID NO:48). The core DBD is
underlined.

Figures 3A-3B present an alignment among FTZ-F1 proteins derived
from the indicated insect species. hv.bftz, Heliothis FTZ-F1 (SEQ ID NO:18);
bmftz, Bombyx mori FTZ-F1 (GenBank Accession No. P49867; SEQ ID
NO:49); dmftz, Drosophila βFTZ-F1 (GenBank Accession No. M98397; SEQ
ID NO:32).

Figures 4A-4F present an alignment among E75 proteins derived
from the indicated insect species. D.melanogaster (A), Drosophila E75A
(GenBank Accession No. A34598; SEQ ID NO:34); M.sexta (A), Manduca
sexta E75A (GenBank Accession No. Q08893; SEQ ID NO:50);
D.melanogaster (B), Drosophila E75B (GenBank Accession No. B34598;
SEQ ID NO:51); M.sexta (B), Manduca sexta E75B (GenBank Accession
No. C56591; SEQ ID NO:52); D.melanogaster (C), Drosophila E75C
(GenBank Accession No. P13055; SEQ ID NO:53); C.fumiferana,
Choristoneura fumiferana E75 (GenBank Accession No. 001639; SEQ ID
NO:54); G.mellonella, Galleria mellonella E75 (GenBank Accession No.
P50239; SEQ ID NO:55); M.ensis, Metapenaeus ensis E75 (GenBank
Accession No. AAC71770; SEQ ID NO:56); H.virescens, Heliothis virescens
(SEQ ID NO:18). The core DBD is underlined.

Figure 5 is a bar graph that depicts percentage survival of Drosophila
following injection of dsRNA corresponding to the indicated nuclear
receptors as described in Example 5 herein below. Solid bar, DHR3; gray
bar, DHR4; open bar, EGON; cross-hatched bar, FTZ-F1; wavy bar, DSF;
checkerboard bar, DERR; stippled bar, FAX1; vertical line bar, FAX2;
horizontal line bar, buffer (control).

Figure 6 is a bar graph that depicts Drosophila larval lethality induced
by overexpression of the indicated nuclear receptors. Gray bars, larvae that
were not heat treated; solid bars, larvae that were heat treated at 0-2 hours
post-hatching; open bars, larvae that were heat treated at 20-22 hours
post-hatching; control, w1118 larvae; DHR38, yw; P[hs-DHR38] larvae;
DHR39, w1118; P[hs-DHR39-6 w+] or w1118; P[hs-DHR39-3 w+] larvae; E75A,
w1118; +/SM5; P[hs-E75A w*]UM3 or w1118; +/SM5; P[hs-E75A w*]fTM3
larvae. Error bars indicate standard deviation.

Brief Description of Sequences in the Sequence Listing 
Odd-numbered SEQ ID NOs:1-31 are nucleotide sequences
described in Table 1.

Even-numbered SEQ ID NOs:2-32 are protein sequences encoded by
the immediately preceding nucleotide sequence, e.g., SEQ ID NO:2 is the
protein encoded by the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:4
is the protein encoded by the nucleotide sequence of SEQ ID NO:3, etc.

SEQ ID NOs:35-42 are PCR primers.

SEQ ID NOs:43-56 are protein sequences available from GenBank 
that are presented in the Figures for comparison with novel sequences 
disclosed herein. 
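
The numbering convention stated above can be sketched in a few lines of Python. This is an illustration only: the function names `protein_seq_id` and `classify_seq_id` are hypothetical, and the ranges are taken solely from the three statements above, so any SEQ ID NO outside those ranges is rejected rather than guessed.

```python
def protein_seq_id(nucleotide_seq_id: int) -> int:
    """Return the SEQ ID NO of the protein encoded by an odd-numbered
    nucleotide SEQ ID NO (1-31), i.e. the immediately following number."""
    if nucleotide_seq_id % 2 == 0 or not 1 <= nucleotide_seq_id <= 31:
        raise ValueError("expected an odd-numbered SEQ ID NO from 1 to 31")
    return nucleotide_seq_id + 1


def classify_seq_id(seq_id: int) -> str:
    """Classify a SEQ ID NO using only the ranges stated in this section."""
    if 1 <= seq_id <= 32:
        return "nucleotide" if seq_id % 2 == 1 else "protein"
    if 35 <= seq_id <= 42:
        return "PCR primer"
    if 43 <= seq_id <= 56:
        return "GenBank comparison protein"
    raise ValueError("SEQ ID NO not covered by the ranges stated here")


if __name__ == "__main__":
    for n in (1, 2, 19, 20, 35, 56):
        print(n, classify_seq_id(n))
    print("SEQ ID NO:1 encodes SEQ ID NO:%d" % protein_seq_id(1))
```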
Table 1. Sequence Listing Summary

SEQ ID NO   Description
1-2         Drosophila gm estrogen-related receptor (gmDERR)
3-4         Drosophila estrogen-related receptor (DERR)
5-6         Drosophila gm fax-related receptor 1 (gmDFAX1)
7-8         Drosophila fax-related receptor 1 (DFAX1)
9-10        Drosophila gm fax-related receptor 2 (gmDFAX2)
11-12       Drosophila fax-related receptor 2 (DFAX2)
13-14       Drosophila hormone receptor 4 (DHR4)
15-16       Drosophila dissatisfaction (DSF)
17-18       Heliothis fushi tarazu factor 1 (FTZ-F1)
19-20       Heliothis ecdysone-inducible protein 75 (E75)
21-22       Heliothis ultraspiracle (USP)
23-24       Heliothis hepatic nuclear factor 4 (HNF4)
25-26       Heliothis hormone receptor 3 (HR3)
27-28       Drosophila hormone receptor 38 (DHR38)
29-30       Drosophila hormone receptor 39 (DHR39)
31-32       Drosophila fushi tarazu factor 1, β isoform (βFTZ-F1)
33-34       Drosophila E75A
35          βFTZ-F1 degenerate forward primer A
36          βFTZ-F1 degenerate forward primer B
37          βFTZ-F1 degenerate reverse primer A
38          βFTZ-F1 degenerate reverse primer B
39          βFTZ-F1 forward primer A
40          βFTZ-F1 forward primer B
41          βFTZ-F1 reverse primer A
42          βFTZ-F1 reverse primer B
43          Bombyx ultraspiracle (USP)
44          Manduca ultraspiracle (USP)
45          Choristoneura ultraspiracle (USP)
46          Locusta ultraspiracle (USP)
47          Chironomus ultraspiracle (USP)
48          Drosophila ultraspiracle (USP)
49          Bombyx FTZ-F1
50          Manduca E75A
51          Drosophila E75B
52          Manduca E75B
53          Drosophila E75C
54          Choristoneura E75
55          Galleria E75
56          Metapenaeus E75

Detailed Description of the Invention
I. Definitions

While the following terms are believed to be well understood by one of
ordinary skill in the art, the following definitions are set forth to facilitate
explanation of the invention.

I.A. Nucleic acids

The nucleic acid molecules provided by the present invention include
the isolated nucleic acid molecules of any one of SEQ ID NOs:1, 5, 9, 13,
17, 19, 21, 23, and 25; sequences substantially identical to sequences of
any one of SEQ ID NOs:1, 5, 9, 13, 17, 19, 21, 23, and 25; conservative
variants thereof, subsequences and elongated sequences thereof,
complementary DNA molecules, and corresponding RNA molecules. The
present invention also encompasses genes, cDNAs, chimeric genes, and
vectors comprising disclosed nuclear receptor nucleic acid sequences.

The term "nucleic acid molecule" refers to deoxyribonucleotides or
ribonucleotides and polymers thereof in either single- or double-stranded
form. Unless specifically limited, the term encompasses nucleic acids
containing known analogues of natural nucleotides that have similar
properties as the reference natural nucleic acid. Unless otherwise indicated,
a particular nucleotide sequence also implicitly encompasses conservatively
modified variants thereof (e.g., degenerate codon substitutions),
complementary sequences, subsequences, elongated sequences, as well as
the sequence explicitly indicated. The terms "nucleic acid molecule" or
"nucleotide sequence" can also be used in place of "gene", "cDNA", or
"mRNA". Nucleic acids can be derived from any source, including any
organism.
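
As an illustration of what a degenerate codon substitution means in practice, the following Python sketch translates two different nucleotide sequences and shows that they encode the same polypeptide. It assumes the standard genetic code; the function names `translate` and `encode_same_polypeptide` and the two short example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(seq: str) -> str:
    """Translate a coding sequence (DNA or RNA) into one-letter amino acids.
    Stop codons are rendered as '*'."""
    dna = seq.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True if two nucleotide sequences differ at most by degenerate
    (synonymous) codon substitutions."""
    return translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    original = "ATGGCTCTGAAA"   # hypothetical example sequence
    variant = "ATGGCCCTCAAG"    # same codons with synonymous third positions
    print(translate(original), translate(variant))
    print(encode_same_polypeptide(original, variant))
```

Here the two sequences differ at three nucleotide positions yet translate to the same four-residue peptide, which is the sense in which one sequence is a conservatively modified variant of the other.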

The term "isolated", as used in the context of a nucleic acid molecule,
indicates that the nucleic acid molecule exists apart from its native
environment and is not a product of nature. An isolated DNA molecule can
exist in a purified form or can exist in a non-native environment such as a
transgenic host cell.

The term "purified", when applied to a nucleic acid, denotes that the
nucleic acid is essentially free of other cellular components with which it is
associated in the natural state. Preferably, a purified nucleic acid molecule
is a homogeneous dry or aqueous solution. The term "purified" denotes that
a nucleic acid gives rise to essentially one band in an electrophoretic gel.
Particularly, it means that the nucleic acid is at least about 50% pure, more
preferably at least about 85% pure, and most preferably at least about 99%
pure.

The term "substantially identical", in the context of two nucleotide
sequences, refers to two or more sequences or subsequences that have at
least 60%, preferably about 70%, more preferably about 80%, more
preferably about 90-95%, and most preferably about 99% nucleotide identity,
when compared and aligned for maximum correspondence, as measured
using one of the following sequence comparison algorithms (described
herein below under the heading "Nucleotide and Amino Acid Sequence
Comparisons") or by visual inspection. Preferably, the substantial identity
exists in nucleotide sequences of at least 50 residues, more preferably in
nucleotide sequences of at least about 100 residues, more preferably in
nucleotide sequences of at least about 150 residues, and most preferably in
nucleotide sequences comprising complete coding sequences. In one
aspect, polymorphic sequences can be substantially identical sequences.
The term "polymorphic" refers to the occurrence of two or more genetically
determined alternative sequences or alleles in a population. An allelic
difference can be as small as one base pair.
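
The percent identity thresholds above can be made concrete with a small calculation. The Python sketch below is a simplified illustration, not one of the sequence comparison algorithms referred to in the text: it assumes the two sequences have already been aligned (gaps written as `-`) and counts identical columns over all aligned columns. The function names and the default 60% threshold parameter are assumptions chosen to mirror the lowest tier stated above, and the example strings are deliberately shorter than the 50-residue window the definition prefers.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent nucleotide identity between two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a gap opposite
    a residue counts as a mismatch."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 60.0) -> bool:
    """Apply a percent identity cutoff (60% is the minimum stated above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTACGTAA"   # differs from `a` at one position
    print(percent_identity(a, b))
    print(is_substantially_identical(a, b, threshold=90.0))
```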

Another indication that two nucleotide sequences are substantially
identical is that the two molecules specifically or substantially hybridize to
each other under stringent conditions. In the context of nucleic acid
hybridization, two nucleic acid sequences being compared can be
designated a "probe" and a "target". A "probe" is a reference nucleic acid
molecule, and a "target" is a test nucleic acid molecule, often found within a
heterogeneous population of nucleic acid molecules. A "target sequence" is
synonymous with a "test sequence".

A preferred nucleotide sequence employed for hybridization studies or
assays includes probe sequences that are complementary to or mimic a
sequence of at least about 14 to 40 nucleotides of a nucleic acid molecule of
the present invention. Preferably, probes comprise 14 to 20 nucleotides, or
even longer where desired, such as 30, 40, 50, 60, 100, 200, 300, or 500
nucleotides or up to the full length of any of those set forth as SEQ ID
NOs:1, 5, 9, 13, 17, 19, 21, 23, and 25. Such fragments can be readily
prepared by, for example, directly synthesizing the fragment by chemical
synthesis, by application of nucleic acid amplification technology, or by
introducing selected sequences into recombinant vectors for recombinant
production.
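
A probe that is complementary to a chosen window of a target sequence can be derived by taking the reverse complement of that window. The Python sketch below illustrates this under stated assumptions: the helper names `reverse_complement` and `design_probe` are hypothetical, the 14-nucleotide minimum simply mirrors the lower bound given above, and the example target is an arbitrary made-up sequence rather than any sequence of the Sequence Listing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_probe(target: str, start: int, length: int = 20) -> str:
    """Return a probe complementary to target[start:start + length].

    Probes shorter than 14 nucleotides are rejected, following the
    preferred minimum length stated in the text."""
    if length < 14:
        raise ValueError("probe length should be at least 14 nucleotides")
    if start < 0 or start + length > len(target):
        raise ValueError("probe window falls outside the target sequence")
    return reverse_complement(target[start:start + length])


if __name__ == "__main__":
    # Hypothetical target sequence used only for demonstration.
    target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    probe = design_probe(target, start=0, length=14)
    print(probe)
    # The probe's reverse complement recovers the targeted window.
    print(reverse_complement(probe) == target[:14])
```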

The phrase "hybridizing specifically to" refers to the binding, 
duplexing, or hybridizing of a molecule only to a particular nucleotide 
sequence under stringent conditions when that sequence is present in a 

15 complex nucleic acid mixture (e.g., total cellular DNA or RNA). 

The phrase "hybridizing substantially to" refers to complementary 
hybridization between a probe nucleic acid molecule and a target nucleic 
acid molecule and embraces minor mismatches that can be accommodated 
by reducing the stringency of the hybridization media to achieve the desired 

20 hybridization. 

"Stringent hybridization conditions" and "stringent hybridization wash 
conditions" in the context of nucleic acid hybridization experiments such as 
Southern and Northern blot analysis are both sequence- and
environment-dependent. Longer sequences hybridize specifically at higher
temperatures. An extensive guide to the hybridization of nucleic acids is
found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular
Biology-Hybridization with Nucleic Acid Probes, part I chapter 2, Elsevier,
New York, New York. Generally, highly stringent hybridization and wash
conditions are selected to be about 5°C lower than the thermal melting point
(Tm) for the specific sequence at a defined ionic strength and pH. Typically,
under "stringent conditions" a probe will hybridize specifically to its target
subsequence, but to no other sequences.




The Tm is the temperature (under defined ionic strength and pH) at
which 50% of the target sequence hybridizes to a perfectly matched probe.
Very stringent conditions are selected to be equal to the Tm for a particular
probe. An example of stringent hybridization conditions for Southern or
Northern blot analysis of complementary nucleic acids having more than
about 100 complementary residues is overnight hybridization in 50%
formamide with 1 mg of heparin at 42°C. An example of highly stringent
wash conditions is 15 minutes in 0.1X SSC, 5M NaCl at 65°C. An example
of stringent wash conditions is 15 minutes in 0.2X SSC buffer at 65°C (see
Sambrook et al., eds (1989) Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York for a
description of SSC buffer). Often, a high stringency wash is preceded by a
low stringency wash to remove background probe signal. An example of
medium stringency wash conditions for a duplex of more than about 100
nucleotides is 15 minutes in 1X SSC at 45°C. An example of low stringency
wash for a duplex of more than about 100 nucleotides is 15 minutes in 4-6X
SSC at 40°C. For short probes (e.g., about 10 to 50 nucleotides), stringent
conditions typically involve salt concentrations of less than about 1M Na+
ion, typically about 0.01 to 1M Na+ ion concentration (or other salts) at
pH 7.0-8.3, and the temperature is typically at least about 30°C. Stringent
conditions can also be achieved with the addition of destabilizing agents
such as formamide. In general, a signal-to-noise ratio 2-fold (or more)
higher than that observed for an unrelated probe in the particular
hybridization assay indicates detection of a specific hybridization.
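
The relationship between Tm and a highly stringent temperature described above can be illustrated numerically. The text does not give a formula for Tm, so the sketch below assumes a commonly quoted empirical approximation for long DNA duplexes (salt, GC content, formamide, and length terms). The function names, the formula, and its coefficients are assumptions introduced for illustration only and are not part of the disclosure.

```python
import math


def estimate_tm_celsius(seq, na_molar, formamide_percent=0.0):
    """Rough Tm estimate (deg C) for a long DNA duplex.

    Assumed empirical approximation (not taken from the text):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


def highly_stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about 5 deg C below Tm, as described in the text."""
    return tm_celsius - offset


if __name__ == "__main__":
    probe = "GC" * 50 + "AT" * 50  # hypothetical 200-nt sequence, 50% GC
    tm = estimate_tm_celsius(probe, na_molar=1.0)
    print(round(tm, 1), round(highly_stringent_temperature(tm), 1))
```

The estimate is only a starting point; in practice the conditions are adjusted empirically for each probe and assay.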

The following are examples of hybridization and wash conditions that
can be used to clone homologous nucleotide sequences that are
substantially identical to reference nucleotide sequences of the present
invention: a probe nucleotide sequence preferably hybridizes to a target
nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4,
1mM EDTA at 50°C followed by washing in 2X SSC, 0.1% SDS at 50°C;
more preferably, a probe and target sequence hybridize in 7% sodium
dodecyl sulfate (SDS), 0.5M NaPO4, 1mM EDTA at 50°C followed by
washing in 1X SSC, 0.1% SDS at 50°C; more preferably, a probe and target
sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1mM
EDTA at 50°C followed by washing in 0.5X SSC, 0.1% SDS at 50°C; more
preferably, a probe and target sequence hybridize in 7% sodium dodecyl
sulfate (SDS), 0.5M NaPO4, 1mM EDTA at 50°C followed by washing in
0.1X SSC, 0.1% SDS at 50°C; more preferably, a probe and target
sequence hybridize in 7% sodium dodecyl sulfate (SDS), 0.5M NaPO4, 1mM
EDTA at 50°C followed by washing in 0.1X SSC, 0.1% SDS at 65°C.

A further indication that two nucleic acid sequences are substantially
identical is that proteins encoded by the nucleic acids are substantially
identical, share an overall three-dimensional structure, are biologically
functional equivalents, or are immunologically cross-reactive. These terms
are defined further under the heading "Polypeptides" herein below. Nucleic
acid molecules that do not hybridize to each other under stringent conditions
are still substantially identical if the corresponding proteins are substantially
identical. This can occur, for example, when two nucleotide sequences are
significantly degenerate as permitted by the genetic code.

The term "conservatively substituted variants" refers to nucleic acid 
sequences having degenerate codon substitutions wherein the third position
of one or more selected (or all) codons is substituted with mixed-base and/or
deoxyinosine residues (Batzer et al. (1991) Nuc Acids Res 19:5081; Ohtsuka 
et al. (1985) J Biol Chem 260:2605-2608; Rossolini et al. (1994) Mol Cell 
Probes 8:91-98). 
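
As an illustration of a mixed-base third position, a degenerate oligonucleotide can be written with standard IUPAC ambiguity codes and expanded into the concrete sequences it represents. The sketch below is a minimal example under that assumption; the function name and the example oligonucleotide are hypothetical, and deoxyinosine substitution is not modeled.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete sequence encoded by a degenerate oligo."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical two-codon oligo: AAY (Asn) followed by GCN (Ala).
    for seq in expand_degenerate("AAYGCN"):
        print(seq)
```

Here the third position of each codon carries the ambiguity code, so all eight expanded sequences encode the same two amino acids.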

The term "subsequence" refers to a sequence of nucleic acids that 
comprises a part of a longer nucleic acid sequence. An exemplary
subsequence is a probe, described herein above, or a primer. The term
"primer" as used herein refers to a contiguous sequence comprising about 8
or more deoxyribonucleotides or ribonucleotides, preferably 10-20
nucleotides, and more preferably 20-30 nucleotides of a selected nucleic
acid molecule. The primers of the invention encompass oligonucleotides of
sufficient length and appropriate sequence so as to provide initiation of 
polymerization on a nucleic acid molecule of the present invention. 




The term "elongated sequence" refers to an addition of nucleotides (or 
other analogous molecules) incorporated into the nucleic acid. For example, 
a polymerase (e.g., a DNA polymerase) can add sequences at the 3' 
terminus of the nucleic acid molecule. In addition, the nucleotide sequence 
can be combined with other DNA sequences, such as promoters, promoter
regions, enhancers, polyadenylation signals, intronic sequences, additional 
restriction enzyme sites, multiple cloning sites, and other coding segments. 

The term "complementary sequences", as used herein, indicates two
nucleotide sequences that comprise antiparallel nucleotide sequences
capable of pairing with one another upon formation of hydrogen bonds
between base pairs. As used herein, the term "complementary sequences"
means nucleotide sequences that are substantially complementary, as can
be assessed by the same nucleotide comparison set forth above, or that are
capable of hybridizing to the nucleic acid segment in question under
relatively stringent conditions such as those described herein. A particular
example of a complementary nucleic acid segment is an antisense
oligonucleotide.
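
The antiparallel pairing described above can be sketched as a reverse-complement computation. The example checks only perfect Watson-Crick complementarity of DNA sequences; substantial complementarity, as defined in the text, would additionally tolerate mismatches. The function names are illustrative assumptions.

```python
# Translation table for Watson-Crick DNA base pairing (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_perfectly_complementary(a, b):
    """True if strand b is the exact reverse complement of strand a."""
    return reverse_complement(a).upper() == b.upper()


if __name__ == "__main__":
    sense = "ATGGCC"  # hypothetical sense-strand fragment
    print(reverse_complement(sense))
```

An antisense oligonucleotide against a given sense sequence would correspond to the output of `reverse_complement` for that sequence.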

The term "gene" refers broadly to any segment of DNA associated 
with a biological function. A gene encompasses sequences including but not
limited to a coding sequence, a promoter region, a cis-regulatory sequence,
a non-expressed DNA segment that is a specific recognition sequence for
regulatory proteins, a non-expressed DNA segment that contributes to gene
expression, a DNA segment designed to have desired parameters, or
combinations thereof. A gene can be obtained by a variety of methods,
including cloning from a biological sample, synthesis based on known or
predicted sequence information, and recombinant derivation of an existing 
sequence. 

The term "gene expression" generally refers to the cellular processes 
by which a biologically active polypeptide is produced from a DNA sequence. 

The present invention also encompasses chimeric genes comprising
the disclosed nuclear receptor sequences. The term "chimeric gene", as
used herein, refers to a promoter region operatively linked to a nuclear
receptor coding sequence, a nucleotide sequence producing an antisense
RNA molecule, an RNA molecule having tertiary structure, such as a hairpin
structure, or a double-stranded RNA molecule.

The term "operatively linked", as used herein, refers to a promoter 
region that is connected to a nucleotide sequence in such a way that the
transcription of that nucleotide sequence is controlled and regulated by that 
promoter region. Techniques for operatively linking a promoter region to a 
nucleotide sequence are known in the art. 

The terms "heterologous gene", "heterologous DNA sequence",
"heterologous nucleotide sequence", "exogenous nucleic acid molecule", or
"exogenous DNA segment", as used herein, each refer to a sequence that
originates from a source foreign to an intended host cell or, if from the same
source, is modified from its original form. Thus, a heterologous gene in a
host cell includes a gene that is endogenous to the particular host cell but
has been modified, for example by mutagenesis or by isolation from native
cis-regulatory sequences. The terms also include non-naturally occurring
multiple copies of a naturally occurring nucleotide sequence. Thus, the
terms refer to a DNA segment that is foreign or heterologous to the cell, or
homologous to the cell but in a position within the host cell nucleic acid
wherein the element is not ordinarily found.

The term "transcription factor" generally refers to a protein that 
modulates gene expression by interaction with the cis-regulatory element 
and cellular components for transcription, including RNA Polymerase, 
Transcription Associated Factors (TAFs), chromatin-remodeling proteins,
and any other relevant protein that impacts gene transcription.

The present invention further includes vectors comprising the
disclosed nuclear receptor sequences, including plasmids, cosmids, and viral
vectors. The term "vector", as used herein, refers to a DNA molecule having
sequences that enable its replication in a compatible host cell. A vector also
includes nucleotide sequences to permit ligation of nucleotide sequences
within the vector, wherein such nucleotide sequences are also replicated in a
compatible host cell. A vector can also mediate recombinant production of a
nuclear receptor polypeptide, as described further herein below. A preferred
host cell is a bacterial cell, an insect cell, or a plant cell.

Nucleic acids of the present invention can be cloned, synthesized,
recombinantly altered, mutagenized, or combinations thereof. Standard
recombinant DNA and molecular cloning techniques used to isolate nucleic
acids are known in the art. Exemplary, non-limiting methods are described
by Sambrook et al., eds (1989) Molecular Cloning, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York; by Silhavy et al. (1984)
Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, New York; by Ausubel et al. (1992) Current Protocols in
Molecular Biology, John Wiley and Sons, Inc., New York, New York; and by
Glover, ed (1985) DNA Cloning: A Practical Approach, IRL Press, Ltd.,
Oxford, United Kingdom. Site-specific mutagenesis to create base pair
changes, deletions, or small insertions is also known in the art, as
exemplified by publications. See e.g., Adelman et al. (1983) DNA 2:183;
Sambrook et al., eds (1989) Molecular Cloning, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York.

Sequences detected by methods of the invention can be detected,
subcloned, sequenced, and further evaluated by any measure known in the
art using any method usually applied to the detection of a specific DNA
sequence including but not limited to dideoxy sequencing, PCR, oligomer
restriction (Saiki et al. (1985) Bio/Technology 3:1008-1012), allele-specific
oligonucleotide (ASO) probe analysis (Conner et al. (1983) Proc Natl Acad
Sci USA 80:278), and oligonucleotide ligation assays (OLAs) (Landegren et
al. (1988) Science 241:1007). See also Landegren et al. (1988) Science
242:229-237.

I.B. Polypeptides

The polypeptides provided by the present invention include the 
isolated polypeptides set forth as SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24,
and 26; polypeptides substantially identical to SEQ ID NOs:2, 6, 10, 14, 18,
20, 22, 24, and 26; nuclear receptor polypeptide fragments (preferably
biologically functional fragments, e.g. the domains described herein), fusion
proteins comprising the disclosed nuclear receptor amino acid sequences,
biologically functional analogs, and polypeptides that cross-react with an 
antibody that specifically recognizes a disclosed nuclear receptor 
polypeptide. 

The term "isolated", as used in the context of a polypeptide, indicates
that the polypeptide exists apart from its native environment and is not a
product of nature. An isolated polypeptide can exist in a purified form or can 
exist in a non-native environment such as, for example, in a transgenic host 
cell. 

The term "purified", when applied to a polypeptide, denotes that the
polypeptide is essentially free of other cellular components with which it is
associated in the natural state. Preferably, a polypeptide is a homogeneous
solid or aqueous solution. Purity and homogeneity are typically determined
using analytical chemistry techniques such as polyacrylamide gel
electrophoresis or high performance liquid chromatography. A polypeptide
that is the predominant species present in a preparation is substantially
purified. The term "purified" denotes that a polypeptide gives rise to
essentially one band in an electrophoretic gel. Particularly, it means that the
polypeptide is at least about 50% pure, more preferably at least about 85%
pure, and most preferably at least about 99% pure.

The term "substantially identical" in the context of two or more
polypeptide sequences refers to polypeptide sequences having about
35%, or 45%, or preferably from 45-55%, or more preferably 55-65%
identical or functionally equivalent amino acids. Even more preferably, two
or more "substantially identical" polypeptide sequences will have about 70%,
or even more preferably about 80%, still more preferably about 90%, still
more preferably about 95%, and most preferably about 99% identical or
functionally equivalent amino acids. Percent "identity" and methods for
determining identity are defined herein below under the heading "Nucleotide
and Amino Acid Sequence Comparisons".

Substantially identical polypeptides also encompass two or more
polypeptides sharing a conserved three-dimensional structure.
Computational methods can be used to compare structural representations,
and structural models can be generated and easily tuned to identify
similarities around important active sites or ligand binding sites. See
Henikoff et al. (2000) Electrophoresis 21(9):1700-1706; Huang et al. (2000)
Pac Symp Biocomput 230-241; Saqi et al. (1999) Bioinformatics 15(6):521-522;
and Barton (1998) Acta Crystallogr D Biol Crystallogr 54:1139-1146.

The term "functionally equivalent" in the context of amino acid
sequences is known in the art and is based on the relative similarity of the
amino acid side-chain substituents. See Henikoff & Henikoff (2000) Adv
Protein Chem 54:73-97. Relevant factors for consideration include
side-chain hydrophobicity, hydrophilicity, charge, and size. For example,
arginine, lysine, and histidine are all positively charged residues;
alanine, glycine, and serine are all of similar size; and phenylalanine,
tryptophan, and tyrosine all have a generally similar shape. By this analysis,
described further herein below, arginine, lysine, and histidine; alanine,
glycine, and serine; and phenylalanine, tryptophan, and tyrosine are defined
herein as biologically functional equivalents.

In making biologically functional equivalent amino acid substitutions,
the hydropathic index of amino acids can be considered. Each amino acid
has been assigned a hydropathic index on the basis of its hydrophobicity
and charge characteristics. These are: isoleucine (+4.5); valine (+4.2);
leucine (+3.8); phenylalanine (+2.8); cysteine (+2.5); methionine (+1.9);
alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan
(-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5);
glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and
arginine (-4.5).

The importance of the hydropathic amino acid index in conferring 
interactive biological function on a protein is generally understood in the art 
(Kyte et al. (1982) J Mol Biol 157:105). It is known that certain amino acids 
can be substituted for other amino acids having a similar hydropathic index
or score and still retain a similar biological activity. In making changes based
upon the hydropathic index, the substitution of amino acids whose
hydropathic indices are within ±2 of the original value is preferred, those that
are within ±1 of the original value are particularly preferred, and those within 
±0.5 of the original value are even more particularly preferred. 
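
The preference tiers for hydropathic-index substitutions can be expressed as a simple lookup. The sketch below uses the index values listed above; the one-letter residue codes, the function name, the tier labels, and the choice to treat the ±2, ±1, and ±0.5 bounds as inclusive are assumptions made for illustration.

```python
# Hydropathic index values as listed in the text (one-letter codes assumed).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_tier(original, replacement):
    """Classify a substitution by the difference in hydropathic index."""
    # Round to suppress floating-point noise at the tier boundaries.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 6)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "not preferred"


if __name__ == "__main__":
    for pair in [("I", "V"), ("L", "I"), ("V", "C"), ("I", "K")]:
        print(pair, hydropathy_tier(*pair))
```

Such a classification considers hydropathy alone; side-chain size, charge, and structural context are not captured by it.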

It is also understood in the art that the substitution of like amino acids
can be made effectively on the basis of hydrophilicity. U.S. Patent No.
4,554,101 states that the greatest local average hydrophilicity of a protein,
as governed by the hydrophilicity of its adjacent amino acids, correlates with
its immunogenicity and antigenicity, e.g., with a biological property of the
protein. It is understood that an amino acid can be substituted for another
having a similar hydrophilicity value and still obtain a biologically equivalent
protein.

As detailed in U.S. Patent No. 4,554,101, the following hydrophilicity
values have been assigned to amino acid residues: arginine (+3.0); lysine
(+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine
(+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5±1);
alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine
(-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine
(-2.5); tryptophan (-3.4).

In making changes based upon similar hydrophilicity values, the 
substitution of amino acids whose hydrophilicity values are within ±2 of the
original value is preferred, those that are within ±1 of the original value are 
particularly preferred, and those within ±0.5 of the original value are even 
more particularly preferred. 
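
The "greatest local average hydrophilicity" mentioned above can be sketched as a sliding-window average over a peptide. The example uses the central hydrophilicity values listed above (ignoring the ±1 ranges given for aspartate, glutamate, and proline). The window length of six residues, the one-letter codes, the function name, and the example peptide are assumptions for illustration.

```python
# Central hydrophilicity values as listed in the text (one-letter codes assumed).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def max_local_hydrophilicity(peptide, window=6):
    """Return (start index, mean value) of the most hydrophilic window."""
    peptide = peptide.upper()
    if len(peptide) < window:
        raise ValueError("peptide is shorter than the window")
    best_start, best_mean = 0, float("-inf")
    for start in range(len(peptide) - window + 1):
        segment = peptide[start:start + window]
        mean = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if mean > best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


if __name__ == "__main__":
    # Hypothetical peptide with a charged stretch flanked by leucines.
    print(max_local_hydrophilicity("LLLLKRDEKRLLLL"))
```

The window reported by such a scan would be one candidate region to examine when considering immunogenicity, subject to experimental confirmation.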

The present invention also encompasses nuclear receptor
polypeptide fragments or functional portions of a nuclear receptor
polypeptide. Such a functional portion need not comprise all or substantially
all of the amino acid sequence of a native nuclear receptor gene product.
The term "functional" includes any biological activity or feature of a nuclear
receptor, including immunogenicity.

The present invention also includes longer sequences of a nuclear
receptor polypeptide, or portion thereof. For example, one or more amino
acids can be added to the N-terminus or C-terminus of a nuclear receptor
polypeptide. Fusion proteins comprising nuclear receptor polypeptide
sequences are also provided within the scope of the present invention.
Methods of preparing such proteins are known in the art.

The present invention also encompasses functional analogs of a
nuclear receptor polypeptide. Functional analogs share at least one
biological function with a nuclear receptor polypeptide. An exemplary
function is immunogenicity. In the context of amino acid sequence,
biologically functional analogs, as used herein, are peptides in which certain,
but not most or all, of the amino acids can be substituted. Functional
analogs can be created at the level of the corresponding nucleic acid
molecule, altering such sequence to encode desired amino acid changes. In
one embodiment, changes can be introduced to improve a biological
function of the polypeptide, e.g., to improve the antigenicity of the
polypeptide. In another embodiment, a nuclear receptor polypeptide
sequence is varied so as to assess the activity of a mutant nuclear receptor
polypeptide.

The present invention also encompasses recombinant production of
the disclosed nuclear receptor polypeptides. Briefly, a nucleic acid
sequence encoding a nuclear receptor polypeptide, or portion thereof, is
cloned into an expression cassette, and the cassette is introduced into a host
organism, where the polypeptide is recombinantly produced.

The term "expression cassette" as used herein means a DNA
sequence capable of directing expression of a particular nucleotide
sequence in an appropriate host cell, comprising a promoter operatively
linked to the nucleotide sequence of interest which is operatively linked to
termination signals. It also typically comprises sequences required for
proper translation of the nucleotide sequence. The expression cassette
comprising the nucleotide sequence of interest can be chimeric. The
expression cassette can also be one that is naturally occurring but has been
obtained in a recombinant form useful for heterologous expression.

The expression of the nucleotide sequence in the expression cassette
can be under the control of a constitutive promoter or an inducible promoter
that initiates transcription only when the host cell is exposed to some
particular external stimulus. Exemplary promoters include Simian virus 40
early promoter, a long terminal repeat promoter from retrovirus, an actin
promoter, a heat shock promoter, and a metallothionein promoter. In the case
of a multicellular organism, the promoter and promoter region can direct
expression to a particular tissue or organ or stage of development. Suitable
expression vectors which can be used include, but are not limited to, the
following vectors or their derivatives: viruses such as vaccinia virus or
adenovirus, baculovirus vectors, yeast vectors, bacteriophage vectors (e.g.,
lambda phage), plasmid and cosmid DNA vectors, and transposon-mediated
transformation vectors.

The term "host cell", as used herein, refers to a cell into which a 
heterologous nucleic acid molecule has been introduced. Transformed cells, 
tissues, or organisms are understood to encompass not only the end product
of a transformation process, but also transgenic progeny thereof.

A host cell strain can be chosen which modulates the expression of
the inserted sequences, or modifies and processes the gene product in the
specific fashion desired. For example, different host cells have characteristic
and specific mechanisms for the translational and post-translational
processing and modification (e.g., glycosylation, phosphorylation) of
proteins. Appropriate cell lines or host systems can be chosen to ensure
the desired modification and processing of the foreign protein expressed.
Expression in a bacterial system can be used to produce a non-glycosylated
core protein product. Expression in yeast will produce a glycosylated
product. Expression in insect cells can be used to ensure "native"
glycosylation of a heterologous protein.

Expression constructs are transfected into a host cell by any standard
method, including electroporation, calcium phosphate precipitation,
DEAE-Dextran transfection, liposome-mediated transfection,
transposon-mediated transformation and infection using a retrovirus. The
nuclear receptor-encoding nucleotide sequence carried in the expression
construct can be stably integrated into the genome of the host or it can be
present as an extrachromosomal molecule.

Isolated polypeptides and recombinantly produced polypeptides can
be purified and characterized using a variety of standard techniques that are
known to the skilled artisan. See e.g., Ausubel et al. (1992) Current
Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, New
York; Bodanszky, et al. (1976) Peptide Synthesis, John Wiley and Sons,
Second Edition, New York, New York; and Zimmer et al. (1993) Peptides,
pp. 393-394, ESCOM Science Publishers, B. V.

I.C. Nucleotide and Amino Acid Sequence Comparisons

The terms "identical" or percent "identity" in the context of two or more 
nucleotide or polypeptide sequences, refer to two or more sequences or 
subsequences that are the same or have a specified percentage of amino 
acid residues or nucleotides that are the same, when compared and aligned
for maximum correspondence, as measured using one of the sequence
comparison algorithms disclosed herein or by visual inspection. 

The term "substantially identical" in regards to a nucleotide or
polypeptide sequence means that a particular sequence varies from the
sequence of a naturally occurring sequence by one or more deletions,
substitutions, or additions, the net effect of which is to retain at least some of
the biological activity of the natural gene, gene product, or sequence. Such
sequences include "mutant" sequences, or sequences wherein the biological
activity is altered to some degree but retains at least some of the original
biological activity. The term "naturally occurring", as used herein, is used to
describe a composition that can be found in nature as distinct from being
artificially produced by man. For example, a protein or nucleotide sequence
present in an organism, which can be isolated from a source in nature and
which has not been intentionally modified by man in the laboratory, is
naturally occurring.

For sequence comparison, typically one sequence acts as a reference
sequence to which test sequences are compared. When using a sequence
comparison algorithm, test and reference sequences are entered into a
computer program, subsequence coordinates are designated if necessary,
and sequence algorithm program parameters are selected. The sequence
comparison algorithm then calculates the percent sequence identity for the
designated test sequence(s) relative to the reference sequence, based on
the selected program parameters.
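
Once a test sequence has been aligned to a reference sequence, percent identity reduces to counting matching columns. The sketch below assumes the two sequences are already aligned (equal length, gaps written as "-") and assumes that the denominator is the number of alignment columns, excluding columns that are gaps in both sequences. Alignment programs differ in this convention, so the function and its conventions are illustrative only.

```python
def percent_identity(aligned_ref, aligned_test, gap="-"):
    """Percent of alignment columns in which both residues are identical."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for ref, test in zip(aligned_ref.upper(), aligned_test.upper()):
        if ref == gap and test == gap:
            continue  # column carries no information
        compared += 1
        if ref == test:
            matches += 1  # neither is a gap here, since both-gap was skipped
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

The same function applies to aligned amino acid sequences, since it compares characters without regard to alphabet.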

Optimal alignment of sequences for comparison can be conducted,
e.g., by the local homology algorithm of Smith & Waterman (1981) Adv Appl
Math 2:482, by the homology alignment algorithm of Needleman & Wunsch
(1970) J Mol Biol 48:443, by the search for similarity method of Pearson &
Lipman (1988) Proc Natl Acad Sci USA 85:2444-2448, by computerized
implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA
in the Wisconsin Genetics Software Package, Genetics Computer Group,
Madison, Wisconsin), or by visual inspection. See generally, Ausubel et al.
(1992) Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New
York, New York.
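
As a concrete instance of local alignment by dynamic programming, the following sketch computes a Smith-Waterman-style optimal local alignment score. The scoring values (match, mismatch, and a linear gap penalty) are arbitrary assumptions chosen for illustration and are not the parameters of any of the programs named above; only the score is returned, not the alignment itself.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                      # start a new local alignment
                previous[j - 1] + pair, # align a[i-1] with b[j-1]
                previous[j] + gap,      # gap in b
                current[j - 1] + gap,   # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    print(local_alignment_score("GGGACGTCCC", "AAACGTAAA"))
```

Keeping only two rows of the scoring matrix is sufficient when the score alone is needed; recovering the aligned residues would require retaining the full matrix for traceback.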

A preferred algorithm for determining percent sequence identity and
sequence similarity is the BLAST algorithm, which is described in Altschul et
al. (1990) J Mol Biol 215:403-410. Software for performing BLAST analyses
is publicly available through the National Center for Biotechnology
Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first
identifying high scoring sequence pairs (HSPs) by identifying short words of
length W in the query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a word of the same
length in a database sequence. T is referred to as the neighborhood word
score threshold. These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are
then extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are
calculated using, for nucleotide sequences, the parameters M (reward score
for a pair of matching residues; always > 0) and N (penalty score for
mismatching residues; always < 0). For amino acid sequences, a scoring
matrix is used to calculate the cumulative score. Extension of the word hits
in each direction is halted when the cumulative alignment score falls off by
the quantity X from its maximum achieved value, the cumulative score goes
to zero or below due to the accumulation of one or more negative-scoring
residue alignments, or the end of either sequence is reached. The BLAST
algorithm parameters W, T, and X determine the sensitivity and speed of the
alignment. The BLASTN program (for nucleotide sequences) uses as
defaults a wordlength W=11, an expectation E=10, a cutoff of 100, M=5,
N=-4, and a comparison of both strands. For amino acid sequences, the
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E)
of 10, and the BLOSUM62 scoring matrix. See Henikoff & Henikoff (1989)
Proc Natl Acad Sci USA 89:10915.
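
The seed-and-extend procedure described above can be sketched for nucleotide sequences as follows. This is a simplified illustration, not the NCBI implementation: seeds are exact word matches only (the neighborhood threshold T is not modeled), extension is ungapped, and the drop-off value X used here is an assumed number, since the text does not state one. The function names and example sequences are hypothetical; M=5 and N=-4 follow the BLASTN defaults stated above.

```python
def find_seeds(query, subject, w=11):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            seeds.append((i, j))
    return seeds


def _extend(query, subject, qi, sj, step, base, m, n, x):
    """Ungapped extension in one direction; returns (best gain, residues kept)."""
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x or base + score <= 0:
            break  # fell off by X from the maximum, or cumulative score <= 0
        qi += step
        sj += step
    return best, best_len


def extend_seed(query, subject, qi, sj, w=11, m=5, n=-4, x=20):
    """Extend a seed both ways; returns (score, query_start, query_end)."""
    seed_score = w * m
    right, right_len = _extend(query, subject, qi + w, sj + w, 1,
                               seed_score, m, n, x)
    left, left_len = _extend(query, subject, qi - 1, sj - 1, -1,
                             seed_score, m, n, x)
    return seed_score + left + right, qi - left_len, qi + w + right_len


if __name__ == "__main__":
    query, subject = "CCACGTACGTGG", "TTACGTACGTAA"
    for qi, sj in find_seeds(query, subject, w=4):
        print((qi, sj), extend_seed(query, subject, qi, sj, w=4, x=8))
```

A short word length is used in the usage example only so that the toy sequences produce seeds; larger W reduces the number of seeds and speeds the search at the cost of sensitivity, as noted in the text.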

In addition to calculating percent sequence identity, the BLAST
algorithm also performs a statistical analysis of the similarity between two
sequences. See e.g., Karlin & Altschul (1993) Proc Natl Acad Sci USA
90:5873-5887. One measure of similarity provided by the BLAST algorithm
is the smallest sum probability (P(N)), which provides an indication of the
probability by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a test nucleic acid
sequence is considered similar to a reference sequence if the smallest sum
probability in a comparison of the test nucleic acid sequence to the reference
nucleic acid sequence is less than about 0.1, more preferably less than
about 0.01, and most preferably less than about 0.001.
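
The smallest sum probability thresholds above can be applied as a simple classification. The sketch below treats the "about" thresholds as exact cutoffs, which is a simplifying assumption; the function name and tier labels are illustrative.

```python
def similarity_tier(smallest_sum_probability):
    """Classify a P(N) value against the 0.1 / 0.01 / 0.001 thresholds."""
    p = smallest_sum_probability
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if p < 0.001:
        return "similar (most preferred)"
    if p < 0.01:
        return "similar (more preferred)"
    if p < 0.1:
        return "similar"
    return "not similar"


if __name__ == "__main__":
    for p in (0.5, 0.05, 0.005, 1e-5):
        print(p, similarity_tier(p))
```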
I.D. Antibodies 

Also provided is an antibody that specifically binds an insect nuclear
receptor polypeptide of the present invention. The term "antibody" indicates
an immunoglobulin protein, or functional portion thereof, including a
polyclonal antibody, a monoclonal antibody, a chimeric antibody, a single
chain antibody, Fab fragments, and a Fab expression library. "Functional
portion" refers to the part of the protein that binds a molecule of interest. In a
preferred embodiment, an antibody of the invention is a monoclonal
antibody. Techniques for preparing and characterizing antibodies are known
in the art. See e.g., Harlow & Lane (1988) Antibodies: A Laboratory Manual,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. A
monoclonal antibody of the present invention can be readily prepared
through use of well-known techniques such as the hybridoma techniques
exemplified in U.S. Patent No. 4,196,265 and the phage display
techniques disclosed in U.S. Patent No. 5,260,203.

The phrase "specifically (or selectively) binds to an antibody", or
"specifically (or selectively) immunoreactive with", when referring to a protein
or peptide, refers to a binding reaction which is determinative of the
presence of the protein in a heterogeneous population of proteins and other
biological materials. Thus, under designated immunoassay conditions, the
specified antibodies bind to a particular protein and do not show significant
binding to other proteins present in the sample. Specific binding to an
antibody under such conditions can require an antibody that is selected
based on its specificity for a particular protein. For example, antibodies
raised to a protein with an amino acid sequence encoded by any of the
nucleic acid sequences of the invention can be selected to obtain antibodies
specifically immunoreactive with that protein and not with unrelated proteins.

The use of a molecular cloning approach to generate antibodies,
particularly monoclonal antibodies, and more particularly single chain
monoclonal antibodies, is also provided. The production of single chain
antibodies has been described in the art. See e.g., U.S. Patent No.
5,260,203. For this approach, combinatorial immunoglobulin phagemid
libraries are prepared from RNA isolated from the spleen of the immunized
animal, and phagemids expressing appropriate antibodies are selected by
panning on tissue that expresses the polypeptide. The advantages of this
approach over conventional hybridoma techniques are that approximately
10^4 times as many antibodies can be produced and screened in a single
round, and that new specificities are generated by heavy (H) and light (L)
chain combinations in a single chain, which further increases the chance of
finding appropriate antibodies. Thus, an antibody of the present invention,
or a "derivative" of an antibody of the present invention, pertains to a single
polypeptide chain binding molecule which has binding specificity and affinity
substantially identical to the binding specificity and affinity of the light and
heavy chain aggregate variable region of an antibody described herein.

The term "immunochemical reaction", as used herein, refers to any of
a variety of immunoassay formats used to detect antibodies specifically
bound to a particular protein, including but not limited to competitive and
non-competitive assay systems using techniques such as
radioimmunoassays, ELISA (enzyme linked immunosorbent assay),
"sandwich" immunoassays, immunoradiometric assays, gel diffusion
precipitation reactions, immunodiffusion assays, in situ immunoassays (e.g.,
using colloidal gold, enzyme or radioisotope labels), Western blot analysis,
precipitation reactions, agglutination assays (e.g., gel agglutination assays,
hemagglutination assays), complement fixation assays, immunofluorescence
assays, protein A assays, and immunoelectrophoresis assays. See
Harlow & Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York for a description of
immunoassay formats and conditions.
I.E. Transgenic Organisms 

It is also within the scope of the present invention to prepare a
transgenic organism to express a transgene comprising nucleic acid
sequences of the present invention. The term "transgenic organism"
indicates an organism comprising a germline insertion of a heterologous
nucleic acid. A transgenic organism can be an animal or a plant.
Transgenic organisms of the present invention are understood to encompass
not only the end product of a transformation method, but also transgenic
progeny thereof.

The term "transgene", as used herein, indicates a heterologous nucleic
acid molecule that has been transformed into a host cell. For intended use
in the creation of a transgenic organism, the transgene can include genomic
sequences of the host organism at a selected locus or site of transgene
integration to mediate a homologous recombination event. A transgene
further comprises nucleic acid sequences of interest, for example a targeted
modification of the gene residing within the locus, a reporter gene, or an
expression cassette, each defined herein above.
II. Nuclear Receptors

II.A. Conserved Features 
The steroid/nuclear receptor superfamily comprises soluble receptor
proteins that function as ligand-inducible transcription factors. Nuclear
receptor polypeptides are characterized by the presence of five or six
evolutionarily conserved domains: A/B, C, D, E and F (Evans (1988) Science
240:889-895; Laudet et al. (1992) EMBO J 11:1003-1013). Various
functions are ascribed to each domain, e.g., "A/B" refers to the
transactivation domain, "C" refers to the DNA binding domain, "D" refers to
the hinge/linker domain, "E" refers to the ligand binding domain, and "F"
refers to the variable C-terminal domain that is present in some receptor
polypeptides. In general, the domains can be modular in that the function of
an individual domain is preserved in the context of a chimeric protein. See
e.g., Green & Chambon (1987) Nature 325:75-78; Green et al. (1988) EMBO
J 7:3037-3044; Giguere et al. (1987) Nature 330:624-629.

The "A/B" region is of variable size and poorly conserved. In some
cases, the A/B region has a transcriptional activation function. See Ptashne
(1988) Nature 335:683-689; Hadzic et al. (1995) Mol Cell Biol 15:4507-4517;
Pakdel et al. (1993) Mol Endocrinol 7:1408-1417; Thompson et al. (1989)
Proc Natl Acad Sci USA 86:3493-3498. Amino acids that confer a
transcriptional activation function can facilitate repeated transcription
initiation events leading to greater levels of gene expression from a target
gene.

The "C" (DNA binding) domain is a highly conserved region of
approximately 198 nucleotides that encode an approximately 66 amino acid
polypeptide comprising two Cys2-Cys2 zinc finger DNA binding
motifs (Danielsen et al. (1989) Cell 57:1131-1138; Green et al. (1988) EMBO
J 7:3037-3044; Umesono & Evans (1989) Cell 57:1139-1146). The
DNA-binding domain also facilitates receptor dimerization (Perlmann et al. (1993)
Genes Dev 7:1411-1422; Zechel et al. (1994) EMBO J 13:1414-1424; Mader
et al. (1993) EMBO J 12:5029-5041; Hard et al. (1990) Science 249:397-404).
The term "core DNA binding domain" is defined as the 66 amino
acid sequence that generally begins with a conserved CYS residue and
ends with conserved residues GLU-MET. See e.g., Rastinejad (1998) in
Freedman, ed, Molecular Biology of Steroid and Nuclear Hormone
Receptors, pp. 107, Birkhauser, Boston, Massachusetts. The DNA binding
domain is also characterized by a conserved three-dimensional fold when
backbone atoms are compared (Rastinejad et al. (1995) Nature 375:203-211).
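
As an illustration only, the sequence-level features of the core DNA
binding domain stated above (about 66 residues, a leading Cys, a trailing
Glu-Met, and about 198 coding nucleotides) can be expressed as a simple
check. The function names and the toy peptide below are hypothetical and are
not part of the disclosure; an actual domain assignment would rely on
sequence alignment rather than on this test.

```python
def coding_length(n_residues: int) -> int:
    """Nucleotides needed to encode n_residues amino acids (3 per codon)."""
    return 3 * n_residues


def looks_like_core_dbd(peptide: str, expected_len: int = 66) -> bool:
    """Check only the features named in the text: length 66, a leading Cys
    (C), and a trailing Glu-Met (EM). This is a coarse filter, not a
    domain predictor."""
    seq = peptide.strip().upper()
    return len(seq) == expected_len and seq.startswith("C") and seq.endswith("EM")


# Hypothetical toy peptide: Cys, 63 filler residues, then Glu-Met.
toy_peptide = "C" + "A" * 63 + "EM"
print(coding_length(66))
print(looks_like_core_dbd(toy_peptide))
```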

The affinity of the DNA binding domain for a DNA recognition site can
be influenced by residues that are N-terminal or C-terminal to the core
DNA-binding domain. See e.g., Ueda et al. (1992) Mol Cell Biol 12:5667-5672;
Rastinejad (1998) in Freedman, ed, Molecular Biology of Steroid and
Nuclear Hormone Receptors, pp. 103-131, Birkhauser, Boston,
Massachusetts and references cited therein. In particular, the term "P Box"
refers to sequences that are adjacent to the C-terminal end of the DNA
binding domain that facilitate binding to the 5' end, generally an A/T-rich
sequence of a target response element.

The "D" (hinge/linker) domain is located between the DNA binding
domain and the ligand binding domain and can contribute to the strength
and specificity of DNA-binding. See e.g., Ueda et al. (1992) Mol Cell Biol
12:5667-5672; Rastinejad (1998) in Freedman, ed, Molecular Biology of
Steroid and Nuclear Hormone Receptors, pp. 103-131, Birkhauser, Boston,
Massachusetts and references cited therein. In some cases, an extended
DNA binding site that includes sequences within the hinge domain can enable
monomeric binding (e.g., Ueda et al. (1992) Mol Cell Biol 12:5667-5672;
Wilson et al. (1992) Science 256:106-110). The hinge region is also
implicated in conferring flexibility between the ligand and DNA binding
domains (Bourguet et al. (1995) Nature 375:377-382; Wagner et al. (1995)
Nature 378:690-697).

The "E" (ligand binding) domain, as used herein, is interchangeably
referred to as the "ligand binding domain" or the "hormone binding domain".
The ligand binding domain comprises a hydrophobic pocket that enables
regulation of nuclear receptor activity by small chemical ligands. Unlike the
DNA binding domain, the ligand binding domain is not clearly delineated by
amino acid sequence. However, the general position of the ligand binding
domain is conserved at the carboxyl end of the protein. The ligand binding
domain is further characterized by a conserved tertiary structure (Wurtz et al.
(1996) Nat Struct Biol 3:87-94).

The functional ligand binding domain within the carboxyl region of a
nuclear receptor polypeptide is operationally defined as the amino acids
required for high affinity binding of any ligand and can be determined
according to methods known in the art. See e.g., Rusconi & Yamamoto
(1987) EMBO J 6:1309-1315; Zhang et al. (1996a) Mol Endocrinol 10:24-34;
Lanz & Rusconi (1994) Endocrinology 135:2183-2194; Xu et al. (1996) J Biol
Chem 271:21430-21438; Zhang et al. (1996b) J Biol Chem 271:14825-14833.

The ligand binding domain also mediates receptor dimerization. See
Simons in Freedman, ed, Molecular Biology of Steroid and Nuclear Hormone
Receptors, pp. 35-104, Birkhauser, Boston, Massachusetts, and references
cited therein. A series of heptad repeats in the ligand binding domain,
specifically helix 10 with some contribution from helix 9, forms the main
dimer interface (Forman et al. (1989) Mol Endocrinol 3:1610-1626; Forman &
Samuels (1990) Mol Endocrinol 4:1293-1301; Bourguet et al. (1995b) Nature
375:377-382; Wurtz et al. (1996) Nat Struct Biol 3:87-94).

II.B. Identification of Novel Drosophila Nuclear Receptors

The present invention provides novel Drosophila nuclear receptor
nucleic acid and polypeptide sequences. Preferably, a Drosophila nuclear
receptor nucleic acid molecule of the present invention comprises the
sequence set forth as any one of SEQ ID NOs:1, 5, 9, 13, 17, 19, 21, 23,
and 25; or a nucleic acid molecule that is substantially identical to any one of
SEQ ID NOs:1, 5, 9, 13, 17, 19, 21, 23, and 25. Also preferably, a nuclear
receptor polypeptide of the present invention comprises an amino acid
sequence set forth as any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24,
and 26; or a polypeptide that is substantially identical to any one of SEQ ID
NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26.

To identify new Drosophila proteins, a database of predicted proteins
(referred to herein as "the GeneMark database") was assembled using the
GeneMark program (Borodovsky & McIninch (1993) Computers & Chemistry
17:123-133) and template 50 kb genomic sequence scaffolds generated by
Celera Corporation (Rockville, Maryland). A profile Hidden Markov Model
(HMM) was built using Drosophila and C. elegans sequences as described
in Example 1. The profile HMM was used to query predicted protein
databases, including the GeneMark database and a predicted protein
database generated by Celera using an alternative protein prediction
program (referred to herein as "the Celera database").
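
The idea of scoring database proteins against a model built from aligned
family members can be sketched as follows. This sketch deliberately
substitutes an ungapped position-specific scoring matrix for a profile HMM
(it has no insert or delete states), and the alignment, the query, and all
names are hypothetical; it is not the procedure of Example 1.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Per-column log-odds scores against a uniform background.
    All aligned sequences are assumed to be ungapped and of equal length."""
    width = len(alignment[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(width):
        counts = Counter(seq[col] for seq in alignment)
        pssm.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_hit(pssm, protein):
    """Return (score, offset) of the best-scoring window, or None if the
    protein is shorter than the model."""
    width = len(pssm)
    best = None
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        score = sum(pssm[i][aa] for i, aa in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical three-column "family" alignment and a toy query protein.
model = build_pssm(["CAC", "CAC", "CGC"])
print(best_hit(model, "MMMCACMM"))
```

A positive log-odds score indicates that the window resembles the aligned
family more than the uniform background; a real search would also need a
significance threshold.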

Three new nuclear receptor sequences were identified (SEQ ID NOs:2,
6, and 10) in the GeneMark database, and three similar sequences were
identified in the Celera database (SEQ ID NOs:4, 8, and 12) (Figure 1). A
fourth nuclear receptor, which was designated DHR4 based on its close
homology to Tenebrio and Manduca THR4 sequences, was variably predicted
based on the sequences of both databases as well as other genomic clones
(Accession No. AL035245). In contrast to the predicted cDNA and protein
sequences, the DHR4 nucleotide sequence disclosed herein (SEQ ID
NO:13) comprises an isolated DHR4 cDNA, which encodes a DHR4 protein
(SEQ ID NO:14) that is different from the predicted protein sequences noted
herein above.

Nuclear receptor cDNAs that encode the novel nuclear receptors
identified in the GeneMark database were predicted by using the predicted
receptor polypeptides to perform a TBLASTN search of the genomic sequence.
Nucleotide sequences retrieved by this search were assembled as the
predicted nuclear receptor cDNAs. The corresponding genes were
subsequently cloned as described in Example 2.
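
A protein-versus-translated-nucleotide search of this kind compares a
protein query against all six reading frames of the DNA. The sketch below
shows only that translation step and uses exact substring matching in place
of the scored local alignment that TBLASTN performs; the function names and
the toy sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AA_TABLE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AA_TABLE)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, offset=0):
    """Translate one reading frame; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_hits(protein, dna):
    """Return (strand, frame offset, residue index) for each frame whose
    translation contains the query protein exactly."""
    dna = dna.upper()
    strands = {"+": dna, "-": dna.translate(COMPLEMENT)[::-1]}
    hits = []
    for strand, seq in strands.items():
        for offset in range(3):
            position = translate(seq, offset).find(protein)
            if position != -1:
                hits.append((strand, offset, position))
    return hits


# Toy genomic fragment: the query is encoded on the forward strand, frame 2.
print(six_frame_hits("MCEM", "CCATGTGCGAAATGT"))
```
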

The novel Drosophila nuclear receptor sequences were named
according to the most closely related nuclear receptor as shown in Table 2.
II.C. Identification of New Heliothis Nuclear Receptors

To identify new nuclear receptors in a pest insect, Heliothis virescens
(hereinafter "Heliothis"), a cDNA library derived from Heliothis transcripts
was screened using a mixture of labeled Drosophila nuclear receptor
sequences as probe, as described in Example 3. Additional Heliothis
nuclear receptor sequences were obtained by PCR using degenerate
primers designed according to Drosophila nuclear receptor sequences, as
described in Example 4. Heliothis nuclear receptor fragments derived from
both methods were assembled by recognition of overlapping sequence.
New Heliothis nuclear receptors were named based on the most closely
related insect nuclear receptor (Table 3 and Figures 2-4). Heliothis FTZ-F1
is most similar to the β isoform encoded by Drosophila FTZ-F1, βFTZ-F1, in
that it lacks an extended N-terminal domain characteristic of the αFTZ-F1
isoform.
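
Assembly of fragments by recognition of overlapping sequence can be
illustrated with a minimal sketch that joins two fragments at their longest
exact suffix-prefix overlap. The function name, the minimum overlap, and the
toy fragments are hypothetical; real assembly must also handle sequencing
errors, reverse complements, and more than two fragments.

```python
def merge_by_overlap(left, right, min_overlap=3):
    """Join two fragments where the end of `left` exactly matches the start
    of `right`. Returns the merged sequence, or None if no overlap of at
    least `min_overlap` bases exists."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first, then shorter ones.
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Toy fragments that share the three bases "CGT".
print(merge_by_overlap("ATGCGT", "CGTTAA"))
```
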
III. Functional Analysis of Drosophila melanogaster Nuclear Receptors

Many insect pests inflict plant damage through feeding activity during
larval stages. Therefore, functional analyses to assess phenotypes
associated with modulation of a nuclear receptor during larval development
can be used to identify candidate insecticide targets. The present invention
discloses nuclear receptors whose regulation is relevant to larval viability.
Thus, modulators that alter nuclear receptor activity in a manner analogous
to the changes resulting from genetic manipulations described herein below
can be useful as insecticide compositions.
III.A. Loss-of-Function Analyses

RNA-mediated interference (RNAi) is a recently discovered method to
determine gene function in a number of organisms, wherein double-stranded
RNA (dsRNA) directs gene-specific, post-transcriptional silencing. See e.g.,
Kuwabara & Olson (2000) Parasitol Today 16(8):347-349; Bass (2000) Cell
101(3):235-238; Hunter (2000) Curr Biol 10(4):R137-140; Bosher &
Labouesse (2000) Nat Cell Biol 2(2):E31-36; Sharp (1999) Genes Dev
13(2):139-141. The double-stranded RNA molecule can be synthesized in
vitro and then introduced into the organism by injection or other methods.
Alternatively, a heritable transgene exhibiting dyad symmetry can provide a
transcript that folds as a hairpin structure. Methods for examining gene
functions using dsRNAi in Drosophila are disclosed in Example 5 and further
in Kennerdell & Carthew (2000) Nat Biotech 18(8):896-898; Lam & Thummel
(2000) Curr Biol 10(16):957-963; Misquitta & Paterson (1999) Proc Natl
Acad Sci USA 96(4):1451-1456.

The present invention discloses RNA-mediated interference of
Drosophila nuclear receptors DERR (SEQ ID NO:2), DFAX1 (SEQ ID NO:6),
DFAX2 (SEQ ID NO:10), DHR4 (SEQ ID NO:14), DSF (GenBank Accession
No. AF10667, SEQ ID NO:16), FTZ-F1 (GenBank Accession No. M98387),
EGON (GenBank Accession No. D43634), and DHR3 (GenBank Accession
No. M90806) (Figure 5). Double-stranded RNA complementary to each
nuclear receptor sequence was synthesized in vitro and injected into early
Drosophila embryos, as described in Example 5. Development of injected
embryos was assessed by scoring: (a) morphological criteria using a light
microscope (Campos-Ortega & Hartenstein (1985) The Embryonic
Development of Drosophila melanogaster, Springer-Verlag, Berlin), (b)
embryo hatching to become a larva, (c) puparium formation, and (d)
eclosion of the pupa as an adult fly, as indicated in Table 4 herein below.
Buffer-injected embryos were monitored in parallel as a control.
The percentage of embryos injected with dsRNA that survive to the adult
stage is depicted in Figure 5.
Injection of double-stranded DHR3, FTZ-F1, and EGON RNA
conferred significant embryonic lethality (92% and 94%, respectively).
Embryos injected with double-stranded EGON RNA also showed lethality
during larval stages (71%). The observed lethal phases resulting from loss
of DHR3, FTZ-F1, and EGON, as determined using dsRNAi, are consistent
with published loss-of-function phenotypes for DHR3 (Carney et al. (1997)
Proc Natl Acad Sci USA 94(22):12024-12029), FTZ-F1 (Yu et al. (1997)
Nature 385:552-555; Guichet et al. (1997) Nature 385:548-552), and EGON
(Dittrich et al. (1997) Development 124(13):2515-2525).

Embryos injected with double-stranded DHR4 or DSF RNA showed
significant embryonic lethality (98% and 85%, respectively). DHR4 and DSF
can also be required during larval development as many genes are essential
at multiple developmental stages. Injection of double-stranded DERR RNA
resulted in lethality predominantly during larval stages (61%). By contrast,
injection of double-stranded DFAX1 or DFAX2 RNA showed some lethality
during embryonic stages, although not substantially different from that of
buffer-injected control animals.

Lethality resulting from loss of nuclear receptor function is predicted to
be mimicked by provision of an antagonist substance that specifically binds a
given receptor. Nuclear receptor antagonists can be identified by methods
known in the art and as further disclosed in the section entitled Identification
of Insect Nuclear Receptor Modulators, herein below. The essentiality of
nuclear receptors DHR4, DSF, and DERR, disclosed herein for the first time,
identifies the utility of antagonists that block or mitigate the activity of DHR4,
DSF, and DERR as insecticides.

III.B. Gain-of-Function Analyses 

Ectopic expression systems have been used to elucidate gene
function when classical loss-of-function genetics is not straightforward. For
example, heat-induced expression of spaghetti squash, which encodes the
nonmuscle myosin II regulatory light chain, can effectively rescue the early
lethality of spaghetti squash mutants, facilitating the analysis of phenotypes
later in development (Edwards & Kiehart (1996) Development 122:1499).
Similarly, dominant phenotypes generated by overexpressing a gene of
interest have been used to address post-embryonic gene functions,
particularly in cases where gene mutation results in embryonic lethality. See
e.g., Lam et al. (1999) Dev Biol 212(1):204-216; Woodard et al. (1994) Cell
79(4):607-615.

Transgenic methods for ectopic expression in Drosophila utilize
promoters that drive either constitutive or regulated expression of the gene
of interest. Constructs designed for ectopic expression can be prepared in a
transformation vector, and are introduced into the fly genome by germ line
transformation. A transgenic line is established, and ectopic expression of
the gene of interest can be analyzed in a wild type or mutant genetic
background.

In one embodiment, a heat shock promoter can be used to temporally
regulate gene expression (Lis et al. (1983) Cell 35:403; Struhl (1985) Nature
318:677; Schneuwly et al. (1987) Nature 325:816). Using this approach, the
level of ectopic gene expression can be easily modulated by altering the
temperature and/or duration of the heat treatment.

Overexpression of nuclear receptors can reveal the role of an
activated nuclear receptor. Provision of ligand and/or apo-receptor (a
receptor not bound by ligand) favors formation of the liganded receptor.
Similarly, provision of excess nuclear receptor by overexpression can also lead
to an excess of active receptor. See e.g., Tsai et al. (1998) in Wilson et al.,
eds, Williams Textbook of Endocrinology, pp. 55-94, W.B. Saunders
Company, Philadelphia, Pennsylvania, and references cited therein. To
create this situation in vivo, a nuclear receptor can be overexpressed using a
heterologous transgene. This strategy enables a functional assessment of
orphan nuclear receptors, wherein a ligand has not yet been identified. A
phenotype observed following nuclear receptor overexpression is predicted
to also be generated by abnormally elevated levels of endogenous ligand or
by administration of a nuclear receptor agonist.

The present invention discloses overexpression of nuclear receptors
during Drosophila larval development. Transgenic Drosophila lines were
employed that carry heat inducible nuclear receptor transgenes, as
described in Example 6. During Drosophila development, the first larval
instar begins with hatching of the embryo, and culminates with the first larval
molt at approximately 24 hours after hatching (25°C). Transgenic larvae
were briefly heat treated (1 hr at 37°C) at the beginning of first instar larval
development (0-2 hrs +/- 15 min) or alternatively at the end of first instar
larval development (20-22 hrs +/- 15 min), immediately prior to molting.
Control experiments omitted heat treatment. Larvae of the genotype from
which the transgenic lines are derived, w1118, were treated in parallel
experiments as an additional control. The developmental progress of larvae
was monitored at 24 hours following heat treatment and at puparium formation.

Heat-induced expression of DHR38, DHR39, and E75A at the end of
the first instar resulted in larval lethality (Table 5, Figure 6). By contrast,
transgenic larvae that were not heat-treated were substantially viable,
demonstrating that the presence of the transgene, in the absence of induced
nuclear receptor expression, is not responsible for the observed lethality.
Further, transgenic larvae that were heat treated at the beginning of first
instar larval development, w1118 larvae that were heat treated either at the
beginning or end of larval development, and non-heat-treated w1118 larvae
were also substantially viable. Thus, the lethality observed in transgenic
larvae that were heat treated just prior to molting (20-22 hr) is attributable to
induced nuclear receptor expression, and is not a consequence of heat
treatment alone.

Table 5. Percent Viability of Heat-Treated Larvae

                       heat treatment
larvae          none       0-2 hr     20-22 hr
control         88%        76%        88%
hs-DHR38        92%        86%        16%
hs-DHR39        90%        78%        12%
hs-E75A         84%        72%        26%
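
The comparison drawn from Table 5 can be made explicit with a short
calculation: for each line, the viability after a heat treatment is
subtracted from the viability of untreated larvae of the same line. The
values are those of Table 5; the dictionary layout and function name are
illustrative only, and this simple difference is a reading aid rather than a
statistical analysis.

```python
# Percent viability from Table 5, keyed by line and heat treatment.
VIABILITY = {
    "control":  {"none": 88, "0-2 hr": 76, "20-22 hr": 88},
    "hs-DHR38": {"none": 92, "0-2 hr": 86, "20-22 hr": 16},
    "hs-DHR39": {"none": 90, "0-2 hr": 78, "20-22 hr": 12},
    "hs-E75A":  {"none": 84, "0-2 hr": 72, "20-22 hr": 26},
}


def viability_drop(line, treatment):
    """Percentage points of viability lost relative to untreated larvae."""
    row = VIABILITY[line]
    return row["none"] - row[treatment]


for line in VIABILITY:
    early = viability_drop(line, "0-2 hr")
    late = viability_drop(line, "20-22 hr")
    print(f"{line}: early drop {early} points, late drop {late} points")
```

Only the transgenic lines show a large drop after the 20-22 hr treatment,
whereas the control shows none, which matches the conclusion stated above.
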
Examination of lethal larvae that were heat treated to induce
expression of DHR39 revealed the presence of multiple mouthhooks (28%).
This phenotype is consistent with a defect in epidermal molting, whereby
mouthhooks are normally expelled (Perrimon et al. (1985) Genetics 11:23-41;
Demerec (1965) Biology of Drosophila, Hafner Publishing, New York,
New York).

Lethality resulting from overexpression of DHR38, DHR39, and E75A
is predicted to be mimicked by provision of agonists that bind these
receptors. Nuclear receptor agonists can be identified by methods known in
the art and as further disclosed in the section entitled Identification of Insect
Nuclear Receptor Modulators, herein below. Prior to the disclosure of the
subject application, the larval lethal phenotypes conferred by overexpression
of DHR38, DHR39, and E75A were unknown. The phenotypic
characterization of nuclear receptor modulation during larval development,
disclosed herein, identifies the utility of nuclear receptor agonists that
activate DHR38, DHR39, E75A, and homologues of the indicated nuclear
receptors as insecticides.

Gain-of-function phenotypes of new Drosophila nuclear receptors
DERR (SEQ ID NO:2), DFAX1 (SEQ ID NO:6), and DFAX2 (SEQ ID NO:10)
disclosed herein can be addressed using ectopic expression techniques in
Drosophila that are known in the art. The present invention provides
nucleotide sequences (SEQ ID NOs:1, 5, and 9) encoding such receptors
that can be used to construct vectors for ectopic expression.
IV. Recombinant Expression of Insect Nuclear Receptors 

For recombinant production of a protein of the invention in a host
organism, a nucleotide sequence encoding the protein is inserted into an
expression cassette designed for the chosen host and introduced into the
host where it is recombinantly produced. The choice of the specific
regulatory sequences such as promoter, signal sequence, 5' and 3'
untranslated sequence, and enhancer appropriate for the chosen host is
within the level of ordinary skill in the art. The resultant molecule, containing
the individual elements linked in the proper reading frame, is inserted into a
vector capable of being transformed into the host cell.

Expression constructs can be transfected into a host cell by a
standard method suitable for the selected host, including electroporation,
calcium phosphate precipitation, DEAE-Dextran transfection,
liposome-mediated transfection, infection using a retrovirus, transposon-mediated
transfer, and particle bombardment techniques. The expression cassette
sequence carried in the expression construct can be stably integrated into
the genome of the host or it can be present as an extrachromosomal
molecule.

Suitable expression vectors and methods for recombinant production
of proteins are known for host organisms such as E. coli, yeast, and insect
cells. See e.g., Luckow & Summers (1988) Bio/Technol 6:47.
Representative methods for recombinant production of an insect nuclear
receptor in E. coli are disclosed in Example 7.

Additional suitable expression vectors are baculovirus expression
vectors, e.g., those derived from the genome of Autographa californica
nuclear polyhedrosis virus (AcMNPV). A preferred baculovirus/insect
system is pVL1392/pVL1393 used to transfect Spodoptera frugiperda (Sf9)
cells in the presence of linear Autographa californica baculovirus DNA
(Pharmingen of San Diego, California). The resulting virus is used to infect
HighFive Trichoplusia ni cells (Invitrogen Corporation of Carlsbad,
California). Representative methods for recombinant production of an insect
nuclear receptor in insect cells are disclosed in Example 8.

Recombinantly produced proteins can be isolated and purified using a
variety of standard techniques. The actual techniques used vary
depending upon the host organism used, whether the protein is designed for
secretion, and other such factors. Such techniques are known to the skilled
artisan. See Ausubel et al. (1992) Current Protocols in Molecular Biology,
John Wiley and Sons, Inc., New York, New York.

The present invention further encompasses recombinant expression
of the disclosed insect nuclear receptors, or portions thereof, in plants, as
described further herein below under the section entitled Transgenic Plants.

V. Production of Insect Nuclear Receptor Antibodies 

In another aspect, the present invention provides a method of
producing an antibody immunoreactive with an insect nuclear receptor
polypeptide, the method comprising recombinantly or synthetically producing
an insect nuclear receptor polypeptide, or portion thereof, to be used as an
antigen. The insect nuclear receptor polypeptide is formulated so that it
can be used as an effective immunogen. An animal is immunized with the
formulated insect nuclear receptor polypeptide to generate an immune
response in the animal. The immune response is characterized by the
production of antibodies that can be collected from the blood serum of the
animal. Optionally, cells producing an insect nuclear receptor antibody can
be fused with myeloma cells, whereby a monoclonal antibody can be
selected. Exemplary methods for producing a monoclonal antibody that
recognizes an insect nuclear receptor protein are described in Example 4.
Preferred embodiments of the method use a polypeptide set forth as any one
of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26.

The present invention also encompasses antibodies and cell lines that
produce monoclonal antibodies as described herein.

The foregoing antibodies can be used in methods known in the art
relating to the localization and activity of the insect nuclear receptor
polypeptide sequences of the invention, e.g., for cloning of insect nuclear
receptor nucleic acids, immunopurification of insect nuclear receptor
polypeptides, imaging insect nuclear receptor polypeptides in a biological
sample, and measuring levels thereof in appropriate biological samples.

VI. Methods for Detecting an Insect Receptor Nucleic Acid 

In another aspect of the invention, a method is provided for detecting
a nucleic acid molecule that encodes an insect nuclear receptor polypeptide.
Such methods can be used to detect insect nuclear receptor gene variants
and related resistance gene sequences. The disclosed methods facilitate
genotyping, cloning, gene mapping, and gene expression studies.

The nucleic acids of the present invention can be used to clone genes
and genomic DNA comprising the disclosed sequences. Alternatively, the
nucleic acids of the present invention can be used to clone genes and
genomic DNA of related sequences, preferably nuclear receptor genes in
pest insects and nematodes. Using the nucleic acid sequences disclosed
herein, such methods are known to one skilled in the art. See, for example,
Sambrook et al., eds (1989) Molecular Cloning, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York. Representative methods
are also disclosed in Examples 3 and 4. Preferably, the nucleic acids used
for this method comprise sequences set forth as any one of SEQ ID NOs:1,
5, 9, 13, 17, 19, 21, 23, and 25.

In one embodiment, genetic assays based on nucleic acid molecules
of the present invention can be used to screen for genetic variants by a
number of PCR-based techniques, including single-strand conformation
polymorphism (SSCP) analysis (Orita et al. (1989) Proc Natl Acad Sci USA
86(8):2766-2770), SSCP/heteroduplex analysis, enzyme mismatch
cleavage, direct sequence analysis of amplified exons (Kestila et al. (1998)
Mol Cell 1(4):575-582; Yuan et al. (1999) Hum Mutat 14(5):440-446),
allele-specific hybridization (Stoneking et al. (1991) Am J Hum Genet
48(2):370-82), and restriction analysis of amplified genomic DNA containing the
specific mutation. Automated methods can also be applied to large-scale
characterization of single nucleotide polymorphisms (Brookes (1999) Gene
234(2):177-186; Wang et al. (1998) Science 280(5366):1077-1082).
Preferred detection methods are non-electrophoretic, including, for example,
the TAQMAN™ allelic discrimination assay, PCR-OLA, molecular beacons,
padlock probes, and well fluorescence. See Landegren et al. (1998)
Genome Res 8:769-776.

VII. Methods for Detecting an Insect Nuclear Receptor Polypeptide

In another aspect of the invention, a method is provided for detecting
a level of insect nuclear receptor polypeptide using an antibody that
specifically recognizes an insect nuclear receptor polypeptide, or portion
thereof. In a preferred embodiment, biological samples from an
experimental subject and a control subject are obtained, and insect nuclear
receptor polypeptide is detected in each sample by immunochemical
reaction with the insect nuclear receptor antibody. More preferably, the
antibody recognizes amino acids of any one of SEQ ID NOs:2, 6, 10, 14, 18,
20, 22, 24, and 26; and is prepared according to a method of the present
invention for producing such an antibody.

In one embodiment, an insect nuclear receptor antibody is used to
screen a biological sample for the presence of an insect nuclear receptor
polypeptide. A biological sample to be screened can be a biological fluid 
such as extracellular or intracellular fluid, or a cell or tissue extract or 
homogenate. A biological sample can also be an isolated cell (e.g., in 
culture) or a collection of cells such as in a tissue sample or histology
sample. A tissue sample can be suspended in a liquid medium or fixed onto
a solid support such as a microscope slide. In accordance with a screening 
assay method, a biological sample is exposed to an antibody 
immunoreactive with an insect nuclear receptor polypeptide whose presence 
is being assayed, and the formation of antibody-polypeptide complexes is
detected. Techniques for detecting such antibody-antigen conjugates or
complexes are known in the art and include but are not limited to 
centrifugation, affinity chromatography and the like, and binding of a labeled 
secondary antibody to the antibody-candidate receptor complex. 

A modulator that shows specific binding to an insect nuclear receptor can
also be used to detect an insect nuclear receptor. Representative
techniques for assaying specific binding are described herein below
under the heading "Identification of Nuclear Receptor Modulators".

The disclosed methods for detecting an insect nuclear receptor 
polypeptide can be useful to determine altered levels of gene expression
that are associated with particular conditions such as enhanced tolerance to
insecticides that target a particular insect nuclear receptor polypeptide. 

VIII. Identification of Nuclear Receptor Modulators

The present invention further discloses a method for identifying a
compound that modulates an insect nuclear receptor. As used herein, the 
terms "candidate substance" and "candidate compound" are used 
interchangeably and refer to a substance that is believed to interact with 
another moiety, wherein a biological activity is modulated. For example, a
representative candidate compound is believed to interact with an insect 
nuclear receptor polypeptide, or fragment thereof, and can be subsequently 
evaluated for such an interaction. Exemplary candidate compounds that can 
be investigated using the methods of the present invention include, but are
not restricted to, viral epitopes, peptides, enzymes, enzyme substrates,
co-factors, lectins, sugars, oligonucleotides or nucleic acids, oligosaccharides,
proteins, chemical compounds, small molecules, and antibodies. A
candidate compound to be tested can be a purified molecule, a homogeneous
sample, or a mixture of molecules or compounds.

As used herein, the term "modulate" means an increase, decrease, or
other alteration of any or all chemical and biological activities or properties of
a wild-type insect nuclear receptor polypeptide, preferably an insect nuclear 
receptor polypeptide of any one of the even-numbered SEQ ID NOs:2-34. 
Preferably, an insect nuclear receptor modulator is an agonist of an insect
nuclear receptor protein activity. As used herein, the term "agonist" means a
substance that synergizes or potentiates the biological activity of a functional 
insect nuclear receptor protein. As used herein, the term "antagonist" refers 
to a substance that blocks or mitigates the biological activity of an insect 
nuclear receptor polypeptide. 

In accordance with the present invention there is also provided a rapid
and high throughput screening method that relies on the methods described
above. This screening method comprises separately contacting each
compound with a plurality of substantially identical target molecules. In such
a screening method the plurality of target molecules preferably comprises
more than about 10^4 samples, or more preferably comprises more than
about 5 x 10^4 samples. In an alternative high-throughput strategy, each
target molecule can be contacted with a plurality of candidate compounds.

The disclosed methods can also be used to identify a modulator that
interacts with an insect nuclear receptor ligand binding domain. Such 
assays can employ a target molecule comprising the full-length nuclear 
receptor polypeptide. Alternatively, an isolated ligand binding domain can 
be recombinantly produced for use in the assay. See Coffer et al. (1996) J
Steroid Biochem Mol Biol 58:467-477; Tetel et al. (1997) Mol Endocrinol
11:1114-1128; Rochel et al. (1997) Biochem Biophys Res Commun
230:293-296. Optionally, additional sequences, such as receptor, GST, or
polyhistidine sequences, can be fused to the amino terminus of the ligand
binding domain to stabilize the conformation of the recombinantly expressed
ligand binding domain. See e.g., Simental et al. (1991) J Biol Chem 
266:510-518; Nemoto et al. (1992) J Steroid Biochem Mol Biol 42:803-812; 
Cooper et al. (1996) J Steroid Biochem Mol Biol 57:251-257; Eul et al. 
(1989) EMBO J 8:83-90; Lin et al. (1991) Mol Endocrinol 5:485-492; Wagner
et al. (1995) Nature 378:690-697; Dallery et al. (1993) Biochem 32:12428-12436;
Lupisella et al. (1995) J Biol Chem 270:24884-24890; Rochel et al.
(1997) Biochem Biophys Res Commun 230:293-296; Leng et al. (1995) Mol 
Cell Biol 15:255-263. 

In one embodiment, the disclosed methods for identifying modulators
of insect nuclear receptors are performed using nuclear receptor sequences
set forth as any one of the even-numbered SEQ ID NOs:2-34. In particular, the
loss-of-function larval lethality that is observed when DHR4, DERR, or DSF function
is disrupted (Figure 5) suggests that antagonists of DHR4, DERR, or DSF
can be useful as insecticides. The larval lethal phenotype that is observed
when DHR38, DHR39, or E75A is overexpressed in Drosophila (Figure 6)
suggests that agonists of DHR38, DHR39, or E75A can be useful as
insecticides.

The disclosed methods for identifying modulators of insect nuclear 
receptors can be performed using nucleic acid sequences derived from a 
pest organism. The nuclear receptor sequences disclosed herein provide
methods for identifying homologous sequences in pest species. Such
techniques are well known to those in the art. See for example, Sambrook et
al., eds (1989) Molecular Cloning, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, New York, and Examples 3 and 4 herein below. 

In a preferred embodiment, the disclosed methods for identifying
modulators employ a Heliothis βFTZ-F1 (SEQ ID NO:18), Heliothis E75
(SEQ ID NO:20), or Heliothis USP (SEQ ID NO:22) polypeptide. The larval
lethal phenotype that is observed when E75A is overexpressed in Drosophila
(Figure 5) suggests that agonists of E75A can be useful as insecticides.
Genetic data in Drosophila show that a loss-of-function mutation in any one of
USP, βFTZ-F1, or E75A also confers larval lethality (Yu et al. (1997) Nature
385:552-555; Johnson & Garza (1998) Ann Dros Res Conf 39:430A),
suggesting that antagonists of USP, βFTZ-F1, and E75A can also be useful
as insecticides.

Representative methods for identification of a substance that binds 
and thereby modulates an insect nuclear receptor are disclosed herein
below. The term "binding" refers to an affinity between two molecules, for
example, a ligand and a receptor. As used herein, "binding" means a 
preferential binding of one molecule for another in a mixture of molecules. 
The binding of the molecules can be considered specific if the binding affinity 
is about 1 x 10^4 M^-1 to about 1 x 10^6 M^-1 or greater. Binding of two
molecules also encompasses a quality or state of mutual action such that an
activity of one protein or compound on another protein is inhibitory (in the 
case of an antagonist) or enhancing (in the case of an agonist). To 
demonstrate saturable binding of a candidate compound, identified by any 
such method, to a nuclear receptor ligand binding domain, Scatchard
analysis can be carried out as described, for example, by Mak et al. (1989) J
Biol Chem 264:21613-21618.
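
As an illustration of the Scatchard analysis mentioned above, the following minimal Python sketch fits a straight line to bound/free versus bound ligand for a single class of non-interacting binding sites. The function names, units, and synthetic data are assumptions made for this example only and are not taken from Mak et al. or from the present disclosure. The sketch also converts the fitted dissociation constant to an association constant for comparison with the 1 x 10^4 M^-1 lower bound stated above.

```python
def scatchard_fit(bound, free):
    """Fit B/F = (Bmax - B) / Kd by least squares; return (Kd, Bmax).

    Assumes one class of independent sites (a hypothetical simplification).
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                 # equals -1 / Kd
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope         # x-axis intercept of the Scatchard line
    return kd, bmax


def is_specific(kd_molar, min_ka=1.0e4):
    """Compare the association constant Ka = 1/Kd (M^-1) with a threshold."""
    return (1.0 / kd_molar) >= min_ka


# Synthetic saturation data (illustrative values): Kd = 5 nM, Bmax = 100 units.
true_kd_nm, true_bmax = 5.0, 100.0
free_nm = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
bound = [true_bmax * f / (true_kd_nm + f) for f in free_nm]

kd_nm, bmax = scatchard_fit(bound, free_nm)
specific = is_specific(kd_nm * 1.0e-9)
```

A linear Scatchard plot is consistent with saturable binding to one class of sites; curvature would suggest multiple site classes or cooperativity, which this sketch does not model.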

VIII.A. Protein Binding Assays 

Several techniques can be used to detect interactions between a 
protein and a chemical ligand without employing an in vivo ligand. 
Representative methods include, but are not limited to, Fluorescence
Correlation Spectroscopy, Surface-Enhanced Laser Desorption/Ionization
Time-Of-Flight Spectroscopy, and Biacore technology, as described herein
below. These methods are amenable to automated, high-throughput
screening. 

Fluorescence Correlation Spectroscopy (FCS) measures the average 
diffusion rate of a fluorescent molecule within a small sample volume 
(Magde et al. (1972) Phys Rev Lett 29:705-708; Maiti et al. (1997) Proc Natl
Acad Sci USA 94:11753-11757). The sample size can be as low as 10^3
fluorescent molecules and the sample volume as low as the cytoplasm of a 
single bacterium. The diffusion rate is a function of the mass of the molecule 
and decreases as the mass increases. FCS can therefore be applied to
polypeptide-ligand interaction analysis by measuring the change in mass
and therefore in diffusion rate of a molecule upon binding. In a typical 
experiment, the target to be analyzed is expressed as a recombinant 
polypeptide with a sequence tag, such as a poly-histidine sequence, inserted 
at the N-terminus or C-terminus. The expression takes place in E. coli, yeast
or mammalian cells. The polypeptide is purified using chromatographic
methods. For example, the poly-histidine tag can be used to bind the
expressed polypeptide to a metal chelate column such as Ni2+ chelated on
iminodiacetic acid agarose. The polypeptide is then labeled with a
fluorescent tag such as carboxytetramethylrhodamine or BODIPY™
(Molecular Probes of Eugene, Oregon). The polypeptide is then exposed in
solution to the potential ligand, and its diffusion rate is determined by FCS 
using instrumentation available from Carl Zeiss, Inc. (Thornwood, New 
York). Ligand binding is determined by changes in the diffusion rate of the 
polypeptide. 
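
The dependence of diffusion rate on mass described above can be sketched numerically. The example below assumes compact spherical particles obeying the Stokes-Einstein relation, so that the hydrodynamic radius scales with the cube root of mass; this scaling, the function name, and the masses used are assumptions for illustration and are not part of the disclosure.

```python
def diffusion_ratio(mass_free_kda, mass_bound_kda):
    """Return D_bound / D_free for compact spheres (radius ~ mass^(1/3))."""
    if mass_free_kda <= 0 or mass_bound_kda <= 0:
        raise ValueError("masses must be positive")
    return (mass_free_kda / mass_bound_kda) ** (1.0 / 3.0)


# A 50 kDa labeled polypeptide binding a 50 kDa partner (illustrative).
ratio_large_partner = diffusion_ratio(50.0, 100.0)

# The same polypeptide binding a 0.5 kDa small molecule (illustrative).
ratio_small_ligand = diffusion_ratio(50.0, 50.5)
```

Under this assumption the diffusion coefficient changes only with the cube root of mass, so a small-molecule ligand bound to a labeled polypeptide produces a much smaller shift than a macromolecular partner; measurement precision therefore matters when the labeled species is the larger molecule.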

Surface-Enhanced Laser Desorption/Ionization (SELDI) was developed by
Hutchens & Yip (1993) Rapid Commun Mass Spectrom 7:576-580. When
coupled to a time-of-flight mass spectrometer (TOF), SELDI
provides a technique to rapidly analyze molecules retained on a chip. It can 
be applied to ligand-protein interaction analysis by covalently binding the
target protein, or portion thereof, on the chip and analyzing by MS the small
molecules that bind to this protein (Worrall et al. (1998) Anal Biochem
70:750-756). In a typical experiment, the target to be analyzed is expressed
as described for FCS. The purified protein is then used in the assay without
further preparation. It is bound to the SELDI chip either by utilizing the poly- 
histidine tag or by other interaction such as ion exchange or hydrophobic 
interaction. The chip thus prepared is then exposed to the potential ligand 
via, for example, a delivery system able to pipet the ligands in a sequential
manner (autosampler). The chip is then washed in solutions of increasing 
stringency, for example a series of washes with buffer solutions containing 
an increasing ionic strength. After each wash, the bound material is 
analyzed by submitting the chip to SELDI-TOF. Ligands that specifically 
bind the target are identified by the stringency of the wash needed to elute
them. 

Biacore relies on changes in the refractive index at the surface layer 
upon binding of a ligand to a target polypeptide immobilized on the layer. In 
this system, a collection of small ligands is injected sequentially in a 2-5
microliter cell, wherein the target polypeptide is immobilized within the cell.
Binding is detected by surface plasmon resonance (SPR) by recording laser 
light refracting from the surface. In general, the refractive index change for a 
given change of mass concentration at the surface layer is practically the 
same for all proteins and peptides, allowing a single method to be applicable
for any protein (Liedberg et al. (1983) Sensors Actuators 4:299-304;
Malmqvist (1993) Nature 361:186-187). In a typical experiment, the target to
be analyzed is expressed as described for FCS. The purified protein is then 
used in the assay without further preparation. It is bound to the Biacore chip 
either by utilizing the poly-histidine tag or by other interaction such as ion
exchange or hydrophobic interaction. The chip thus prepared is then
exposed to the potential ligand via the delivery system incorporated in the 
instruments sold by Biacore (Uppsala, Sweden) to pipet the ligands in a 
sequential manner (autosampler). The SPR signal on the chip is recorded 
and changes in the refractive index indicate an interaction between the
immobilized target and the ligand. Analysis of the signal kinetics of on rate
and off rate allows the discrimination between non-specific and specific 
interaction. 
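
The on-rate and off-rate analysis described above can be sketched with a simple 1:1 (Langmuir) interaction model, which is assumed here for illustration and is not stated in the disclosure. The function names and rate constants below are hypothetical. The sketch simulates an association phase, then recovers the off rate from the dissociation phase by a log-linear fit and derives the equilibrium dissociation constant as koff/kon.

```python
import math


def association_response(t, conc, kon, koff, rmax):
    """SPR response during analyte injection under a 1:1 binding model."""
    kobs = kon * conc + koff
    r_eq = rmax * kon * conc / kobs
    return r_eq * (1.0 - math.exp(-kobs * t))


def dissociation_response(t, r0, koff):
    """SPR response after the analyte is replaced by buffer."""
    return r0 * math.exp(-koff * t)


def estimate_koff(times, responses):
    """Estimate koff as the negative slope of ln(response) versus time."""
    logs = [math.log(r) for r in responses]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    return -stl / stt


# Illustrative constants: kon in 1/(M*s), koff in 1/s, analyte at 100 nM.
kon, koff, conc = 1.0e5, 1.0e-3, 1.0e-7
r0 = association_response(300.0, conc, kon, koff, rmax=100.0)
times = [0.0, 60.0, 120.0, 180.0, 240.0]
trace = [dissociation_response(t, r0, koff) for t in times]

koff_est = estimate_koff(times, trace)
kd_est = koff_est / kon  # equilibrium dissociation constant in M
```

In practice, a specific interaction would be expected to fit such a model with a concentration-dependent association phase, whereas non-specific binding often shows fast, non-saturating responses; real sensorgrams also require baseline and bulk refractive index corrections that are omitted here.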

VIII.B. Peptide Interaction Assays

Methods for displaying diverse peptide libraries enable rapid library 
construction, amplification, and selection of ligands directed against a target 
molecule. See Lowman (1997) Annu Rev Biophys Biomol Struct 26:401-424;
Sidhu (2000) Curr Opin Biotech 11(6):610-616; and U.S. Patent No.
5,510,240. Assays can also be employed that select peptides capable of
disrupting the interaction between a nuclear receptor and a requisite
co-factor, as described by Hall et al. (2000) Mol Endocrinol 14(12):2010-2023;
Northrop et al. (2000) Mol Endocrinol 14(5):605-622; International Publication
No. WO 00/37077, herein incorporated by reference.

VIII.C. Transcriptional Assays

The present invention also provides methods for identifying 
modulators of insect nuclear receptor transcriptional activation. One strategy 
employs an expression system comprising: (1) an insect nuclear receptor
comprising a functional ligand binding domain of an insect nuclear receptor,
(2) a target gene expression cassette comprising a response element 
regulated by the chimeric nuclear receptor operatively linked to a reporter 
gene, and (3) a test compound. Methods for constructing a chimeric nuclear 
receptor gene and a target gene expression cassette are described herein
below under the heading "Chimeric Receptors for Inducible Gene
Expression". See also, Wentworth et al. (2000) J Endocrinol 166(3):R11-16;
Yang & Chen (1999) Cancer Res 59(18):4519-4524, and U.S. Patent No.
4,981,784, herein incorporated by reference. 

The term "reporter gene" refers to a heterologous gene encoding a
product that is readily observed and/or quantitated. A reporter gene is
heterologous in that it originates from a source foreign to an intended host 
cell or, if from the same source, is modified from its original form. Any 
suitable reporter and detection method can be used in accordance with the 
disclosed methods. Non-limiting examples of detectable reporter genes that
can be operatively linked to a transcriptional regulatory region can be found
in Alam and Cook (1990) Anal Biochem 188:245-254 and International
Publication No. WO 97/47763. Preferred reporter genes for transcriptional
analyses include the lacZ gene (See e.g., Rose and Botstein (1983) Meth
Enzymol 101:167-180), Green Fluorescent Protein (GFP) (Cubitt et al. 
(1995) Trends Biochem Sci 20:448-455), luciferase, or chloramphenicol 
acetyl transferase (CAT).

An amount of reporter gene expression can be assayed by any method for
qualitatively, or preferably quantitatively, determining presence or activity of
the reporter gene product. The amount of reporter gene expression directed 
by each test substance is compared to an amount of reporter gene 
expression in the absence of a test substance. A test substance is identified
as having agonist activity when there is a significant increase in a level of
reporter gene expression in the presence of the substance when compared 
to a level of reporter gene expression in the absence of the test substance. 
The term "significant increase", as used herein, refers to a quantified
change in a measurable quality that is larger than the margin of error
inherent in the measurement technique, preferably an increase by about
2-fold or greater relative to a control measurement, more preferably an
increase by about 5-fold or greater, and most preferably an increase by 
about 10-fold or greater. 
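
The fold-increase criteria above can be expressed as a short calculation. The following Python sketch is illustrative only: the function names, the tier labels, and the replicate values are assumptions, and the margin-of-error check is reduced to a single hypothetical `margin` parameter supplied by the user.

```python
from statistics import mean


def fold_induction(treated, control):
    """Ratio of mean reporter signal with and without the test substance."""
    return mean(treated) / mean(control)


def agonist_tier(treated, control, margin=0.0):
    """Classify a test substance by the 2-, 5-, and 10-fold thresholds.

    `margin` is a hypothetical absolute measurement error; the increase
    must exceed it before any fold threshold is considered.
    """
    if mean(treated) - mean(control) <= margin:
        return "no significant increase"
    fold = fold_induction(treated, control)
    if fold >= 10.0:
        return "most preferred (about 10-fold or greater)"
    if fold >= 5.0:
        return "more preferred (about 5-fold or greater)"
    if fold >= 2.0:
        return "preferred (about 2-fold or greater)"
    return "increase below 2-fold"


# Illustrative luciferase readings in arbitrary light units.
control_wells = [100.0, 110.0, 90.0]
treated_wells = [620.0, 580.0, 600.0]
tier = agonist_tier(treated_wells, control_wells, margin=15.0)
```

Here the treated mean is 600 and the control mean is 100, so the substance falls in the 5-fold tier. A fuller analysis would normally normalize for cell number or transfection efficiency and apply a statistical test across replicates.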

VIII.D. Rational Design

Knowledge of the structure of a native nuclear receptor polypeptide
provides an approach for rational pesticide design. See e.g. Schapira et al.
(2000) Proc Natl Acad Sci USA 97(3):1008-1013. The structure of a nuclear 
receptor polypeptide can be determined by X-ray crystallography or by 
computational algorithms that generate three-dimensional representations.
See Huang et al. (2000) Pac Symp Biocomput 230-41; Saqi et al. (1999)
Bioinformatics 15:521-522; International Publication No. WO 99/26966, 
herein incorporated by reference. Alternatively, a working model of a 
nuclear receptor structure can be derived by homology modeling (Maalouf et
al. (1998) J Biomol Struct Dynamics 15(5):841-851). Computer models can
further predict binding of a protein structure to various substrate molecules,
which can be synthesized and tested. Additional compound design
techniques are described in U.S. Patent Nos. 5,834,228 and 5,872,011.

IX. Methods for Pest Control

Another aspect of the present invention is a method for pest control 
by modulation of insect nuclear receptor biological activity. Substances 
having such activity can be discovered by the methods disclosed herein and 
include, but are not limited to, chemical compounds, antibodies, and gene
products encoded by plant transgenes. 

The present invention provides methods for preventing the onset or 
progression of a pest infestation in a plant. The method comprises 
administering a modulator of a nuclear receptor set forth as any one of the 
even-numbered SEQ ID NOs:2-34, wherein modulation of the nuclear
receptor results in organismal lethality. Preferably, the lethality occurs 
during larval development. 

IX.A. Formulation 

An insect nuclear receptor modulator of the present invention is
typically formulated using acceptable vehicles, adjuvants, and carriers as
desired. Representative formulations include emulsifiable concentrates,
water-miscible liquids, wettable powders, water-soluble powders, oil
solutions, flowable powders, aerosols, vapors, granules, microcapsules,
fumigants, ultra-low volume concentrates, fogging concentrates,
impregnating materials, poison baits, and seed dressings. See e.g., Perry et
al. (1997) Insecticides in Agriculture and Environment: Retrospects and 
Prospects , pp. 7-10, Springer-Verlag, New York, New York. A formulation 
can be further selected based on its ability to improve insecticide properties 
such as storage, handling, application, effectiveness, safety to the applicator
and the environment, and cost.

An insecticide formulation can further include a synergist that can 
enhance the activity of an insect nuclear receptor modulator of the present 
invention. See Yamamoto (1973) in Casida, ed, Pyrethrum, The Natural
Insecticide, pp. 191-170, Academic Press, New York, New York; Hodgson &
Tate (1976) in Wilkinson, ed, Insecticide Biochemistry and Physiology, pp.
115-148, Plenum Press, New York, New York; Wilkinson (1976a) in Tahori,
ed, Proc 2nd Int Congr on Pesticides and Chemistry, Vol. 2, pp. 117-159,
Gordon & Breach, New York, New York; Wilkinson (1976b) in Metcalf &
McKelvey, eds, The Future for Insecticides: Needs and Prospects, Vol. 6,
pp. 191-178, Wiley, New York, New York; Casida & Quistad (1995) in
Casida & Quistad, eds, Pyrethrum Flowers: Production, Chemistry,
Toxicology, and Uses, pp. 258-276, Oxford University Press, New York, New
York. Alternatively, synergism can be accomplished by treatment of a plant 
prior to application of an insect nuclear receptor modulator, or by application 
of a synergist at sites on a plant distinct from sites of application of an insect
nuclear receptor modulator. 

IX.B. In Vivo Assays

The insecticidal activity of a modulator of an insect nuclear receptor 
can be tested using standard techniques in the art, including topical 
application, injection, dipping, contact or residual exposure, and 
feeding/drinking. See e.g., Perry et al. (1997) Insecticides in Agriculture and
Environment: Retrospects and Prospects, pp. 12-13, Springer-Verlag, New
York, New York. As one example, a formulation comprising a modulator is 
sprayed on a plant, insect larvae are then applied to the plant, and after an 
appropriate temporal duration, a degree of plant destruction by the larvae is 
quantitated. 

IX.C. Dose and Administration

The toxicity of an insecticide to an organism can be expressed in
terms of the amount of compound per unit weight of the organism required to
kill 50% of the test population, also referred to as the lethal dose (LD50). The
LD50 is usually expressed in milligrams per kilogram of body weight or
micrograms per insect. The lethal concentration (LC50) is the concentration
of a compound in an external medium that is required to kill 50% of the test
population, and is expressed as the percentage or parts per million (ppm) of
the active ingredient (AI) in the medium. This value can be used when the
exact dose administered to an insect cannot be determined. The
effectiveness of a candidate insecticidal substance can also be assayed in
terms of lethal time (LT50). LT50 represents the time required to elicit 50%
mortality of the test organisms at a specified dose or concentration and is a
suitable measure for field tests. 
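
As a rough illustration of how a median lethal value might be estimated from bioassay data, the following Python sketch interpolates linearly on a log10 dose scale between the two tested doses that bracket 50% mortality. This interpolation is a simplifying assumption chosen for brevity; it is not a method prescribed by the disclosure, and probit or logit regression would be more usual for formal work. The function name and data are hypothetical.

```python
import math


def estimate_median_lethal(doses, mortality):
    """Estimate the dose giving 50% mortality by log-dose interpolation.

    doses     -- positive dose or concentration values
    mortality -- observed fraction killed (0.0 to 1.0) at each dose
    """
    pairs = sorted(zip(doses, mortality))
    for (d_lo, m_lo), (d_hi, m_hi) in zip(pairs, pairs[1:]):
        if m_lo <= 0.5 <= m_hi and m_hi > m_lo:
            frac = (0.5 - m_lo) / (m_hi - m_lo)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10.0 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("mortality data do not bracket 50%")


# Illustrative topical-application data: micrograms per insect.
doses_ug = [0.1, 1.0, 10.0, 100.0]
fraction_killed = [0.05, 0.25, 0.75, 0.95]
ld50_ug = estimate_median_lethal(doses_ug, fraction_killed)
```

The same calculation applies to an LC50 when the inputs are concentrations in the external medium, and to a KD50 when knockdown rather than death is scored.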

In some instances, a rate of knockdown rather than lethality is 
measured as a criterion of effectiveness. In such cases the knockdown dose 
(KD50) or the knockdown time (KT50) can be used to express insecticidal
activity. 

The present invention also envisions the identification of insecticidal 
substances wherein killing or knockdown does not constitute the desired 
criterion. For example, useful assays can also assess non-lethal measures
such as, for example, progression to developmental stages, fecundity, egg
viability, attractant or repellant activity, paralysis, and anti-feeding activity. 

Insect nuclear receptor modulators identified in accordance with 
methods of the present invention are useful for preventing or treating an 
insect infestation, and in some cases a nematode infestation, in a plant or
animal. Prevention and treatment methods employ an effective amount of
the modulator. The term "effective amount" as used herein refers to an 
amount effective to prevent or ameliorate infestation. 

An effective amount can comprise a range of amounts. One skilled in 
the art can readily assess the potency and efficacy of an insect nuclear
receptor modulator of the present invention and adjust the administration
regimen accordingly. A modulator of insect nuclear receptor biological
activity can be evaluated by a variety of techniques, for example, by using a
responsive reporter gene in a transcriptional assay, by assaying interaction
of insect nuclear receptor polypeptides with a monoclonal antibody, or by
assaying insect viability when a modulator is administered to an insect, each
technique described herein. One of ordinary skill in the art can tailor the
dosages to a particular application, taking into account the particular 
formulation and method of administration to be used with the composition as 
well as the type of plant or animal, the development stage of the plant or
animal, and the severity of the infestation to be treated.

IX.D. Transgenic Plants

The present invention also encompasses methods for pest control
wherein an insect nuclear receptor modulator is expressed in a plant. 
Preferably, a nucleic acid, peptide or polypeptide encoded by a transgene in 
a plant modulates the activity of any of SEQ ID NOs:1-34. In one 
embodiment, a transgene can encode a peptide that specifically binds an
insect nuclear receptor of the present invention. In another embodiment, a 
construct encoding an antibody that specifically binds an insect nuclear 
receptor of the present invention can be expressed in plants to confer insect 
control. See e.g., U.S. Patent No. 5,686,600, the contents of which are 
herein fully incorporated by reference. Methods for generating a transgenic
plant are known in the art and are discussed further herein below.

IX.E. Target Organisms

Insect nuclear receptor modulators discovered according to the 
methods disclosed herein can be used for the prevention or amelioration of a
pest infestation. The term "pest" as used herein refers to any organism that
damages a plant, including mature plants, seedlings, and stored grain. The 
term "pest" also refers to any organism that causes disease in an animal. 
The compositions and methods disclosed herein are envisioned to be 
particularly useful to prevent or to treat infestation of insect pests, including
but not limited to aphids, locusts, spider mites, boll weevils, and pests that
attack stored grains (e.g., Tribolium and Tenebrio). The present disclosure
is also relevant to methods for controlling soil nematodes and plant-parasitic
nematodes such as Meloidogyne.

X. Chimeric Receptors for Inducible Gene Expression

Transgenic methods have enabled the generation of plants with
improved traits by expression of a transgene encoding a heterologous
polypeptide of interest. Ideally, the temporal profile of transgene expression 
can be controlled. 

The present invention envisions a gene switch method that employs 
three or more components: (1) a nuclear receptor expression cassette, (2) a
ligand that binds the polypeptide encoded by the nuclear receptor 
expression cassette, and (3) a target gene expression cassette that is
modulated in the presence of the encoded nuclear receptor polypeptide
further bound by a ligand. 

A nuclear receptor expression cassette of the present invention
encodes a nuclear receptor polypeptide. In one embodiment, the nuclear
receptor polypeptide is composed of a hinge region, a ligand binding
domain, a DNA binding domain, and a transactivation domain. The DNA 
binding domain enables binding of the nuclear receptor polypeptide to a 
sequence-specific response element in the 5' regulatory region of a target 
expression cassette. The hinge domain of the receptor polypeptide resides
between the DNA binding and ligand binding domains and influences the
activity of the ligand binding domain. The ligand binding domain of the 
receptor polypeptide can bind a chemical ligand, thereby eliciting a 
conformational change in the receptor polypeptide that allows the 
transactivation domain to affect transcription of the target nucleotide
sequence.

A "target expression cassette" comprises a nucleotide sequence for a 
5' regulatory region operatively linked to a target nucleotide sequence, the 
expression of which is activated by a receptor polypeptide in the presence of 
a chemical ligand. The 5' regulatory region of the target gene comprises a
core promoter sequence, an initiation of transcription sequence, and one or
more sequence-specific response elements required for receptor binding to 
the target gene regulatory region. The promoter sequence can be a minimal 
promoter. The target expression cassette can also possess a 3' termination 
region (stop codon and polyadenylation sequence). The target nucleotide
sequence can encode, for example, a polypeptide, an antisense RNA, or a
double-stranded RNA molecule. A target sequence can be part of a target 
cassette transformed into a host organism, or it can be a target sequence of 
a native host organism gene. 

The chimeric receptor polypeptides used in the present invention can
have one or more domains obtained from a heterologous source. The use of
chimeric receptor polypeptides has the benefit of combining domains from
different sources, thus providing a receptor polypeptide activated by a choice
of chemical ligands and possessing desirable ligand binding, DNA binding
and transactivation characteristics. 

It is also considered a part of the present invention that the 
transactivation (A/B), ligand-binding (E), and DNA-binding (C) domains can 
be assembled in the chimeric receptor polypeptide in any functional
arrangement. For example, where one subdomain of a transactivation 
domain is found at the N-terminal portion of a naturally-occurring receptor, a 
chimeric receptor polypeptide of the present invention can include a 
transactivation domain at the C-terminus in place of, or in addition to, a
transactivation domain at the N-terminus. Chimeric receptor polypeptides as
disclosed herein can also have multiple domains of the same type, for 
example, more than one transactivation domain per receptor polypeptide.

X.A. DNA Binding Domain of a Receptor Expression Cassette

A chimeric receptor of the present invention can comprise a DNA-binding
domain from any one of SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24,
and 26. The term "DNA binding domain" as used in the context of a chimeric 
receptor of the present invention comprises a functional domain that shows 
high affinity sequence-specific DNA binding. A functional DNA binding 
domain will generally include the core DNA binding domain, and optionally,
sequences adjacent to the core DNA binding domain that contribute to high
affinity sequence-specific DNA binding.

A gene switch receptor cassette preferably encodes a chimeric 
nuclear receptor that modulates gene expression as a monomer or dimer. 
Drosophila βFTZ-F1 and the vertebrate homologue of βFTZ-F1, SF-1, can
bind to an extended half-site response element as a monomer and can
further function as a transcriptional activator in this context (Ueda et al. 
(1992) Mol Cell Biol 12(12):5667-5672; Wilson et al. (1993) Mol Cell Biol 
13(9):5794-5804). Similarly, a vertebrate homologue of E75, RevErb, 
represses transcription on both monomeric and dimeric binding sites
(Harding & Lazar (1995) Mol Cell Biol 15(11):6479; Lazar & Harding (1998)
in Freedman, ed, Molecular Biology of Steroid and Nuclear Hormone
Receptors, pp. 269-270, Birkhauser, Boston, Massachusetts). The
Drosophila DNA-binding domain and C-terminal extension of the core
DNA binding domain is highly homologous to RevErb and shows similar
DNA-binding properties (Segraves & Hogness (1990) Genes Dev 4:204-219;
Lazar & Harding (1998) in Freedman, ed, Molecular Biology of Steroid and
Nuclear Hormone Receptors, pp. 270, Birkhauser, Boston, Massachusetts).

The Heliothis βFTZ-F1 and E75A nuclear receptors disclosed herein
are highly conserved compared to their Drosophila and vertebrate
homologues (Figures 3 and 4). In particular, the DNA binding domain and
the C-terminal extension of the core DNA-binding domain of Heliothis
βFTZ-F1 (SEQ ID NO:18) are substantially identical to those of Drosophila
βFTZ-F1. The DNA binding domain and the C-terminal extension of the core
DNA-binding domain of Heliothis E75 (SEQ ID NO:20) are substantially
identical to those of Drosophila E75A. Given the similar DNA binding
properties between Drosophila βFTZ-F1 and vertebrate SF-1, and between
Drosophila E75A and vertebrate RevErb, Heliothis βFTZ-F1 and E75 likely
function as monomeric and/or dimeric transcriptional regulators as well.
Thus, the DNA binding domains, optionally with the extended C-terminal
DNA binding sequence, can be useful in gene switch receptor expression
cassettes that operate as monomeric transcriptional regulators.

The vertebrate nuclear receptor most closely related to DERR,
estrogen-related receptor, also binds and activates transcription as a
monomer, again utilizing an extended C-terminal DNA binding sequence
(Bonnelye et al. (1997) Mol Endocrinol 11(7):905-916; Johnston et al. (1997)
Mol Endocrinol 11(3):342-352). Thus, the DERR DNA binding domain,
optionally also including the C-terminal extended DNA binding sequence,
can be useful in gene switch receptor expression cassettes.

Additional flexibility in controlling gene expression by the present
invention can be obtained by using DNA binding domains and response
elements from other transcriptional activators, which include but are not
limited to the bacterial LexA or yeast GAL4 proteins (Brent & Ptashne (1985)
Cell 43:729-736; Sadowski et al. (1988) Nature 335:563-564).

An additional degree of flexibility in controlling gene expression can
be obtained by using synthetic DNA binding domains and response
elements. Protein engineering experiments can rationally alter the DNA
binding characteristics of zinc finger domains to bind to a DNA target
sequence of choice (Liu et al. (1997) Proc Natl Acad Sci USA 94:5525-5530;
Desjarlais & Berg (1993) Proc Natl Acad Sci USA 90:2256-2260). For
example, the use of a synthetic zinc finger binding domain allows the
chimeric receptor polypeptide to recognize a target sequence of choice.

A chimeric receptor expression cassette of the present invention
comprises a suitable promoter operatively linked to the coding sequence
intended for expression. The expression of the nucleotide sequence in the
expression cassette can be under the control of a constitutive promoter, an
inducible promoter, or a tissue-specific promoter. Depending upon the host
cell system utilized, any one of a number of suitable promoters can be used.
Promoter selection can be based on expression profile and expression level.

X.B. Target Gene Expression Cassette 

A target gene expression cassette includes a response element that 
is operatively linked to a target gene of interest. 

In one embodiment, gene switch systems of the present invention can
employ a chimeric receptor polypeptide having a DNA binding
domain that binds a well-characterized response element, for example, the
LexA and GAL4 binding sites (Brent & Ptashne (1985) Cell 43:729-736;
Sadowski et al. (1988) Nature 335:563-564).

In another embodiment, a gene switch system of the present
invention can employ a chimeric receptor polypeptide comprising a DNA
binding domain derived from a novel insect nuclear receptor disclosed herein
(SEQ ID NOs:2, 6, 10, 14, 18, 20, 22, 24, and 26). Response elements that
are recognized by DNA binding domains of novel nuclear receptors can be
determined according to standard methods. Briefly, in vivo footprinting
assays can demonstrate protection of DNA sequences from chemical and
enzymatic modification within living or permeabilized cells. Similarly, in
vitro footprinting assays can demonstrate protection of DNA sequences from
chemical or enzymatic modification using protein extracts. Nitrocellulose
filter-binding assays and gel electrophoresis mobility shift assays (EMSAs)
can detect binding of candidate transcription factors to radiolabeled
regulatory DNA elements. Computer analysis programs, for example TFSEARCH
version 1.3 (Yutaka Akiyama: "TFSEARCH: Searching Transcription Factor
Binding Sites", http://www.rwcp.or.jp/papia/), can be used to locate
consensus sequences of known cis-regulatory elements within a genomic
region.
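
The computational step described above, locating a consensus element
within a genomic region, can be sketched as follows. This is a minimal
illustration and not the TFSEARCH program: the function names, the consensus
string, and the test sequence are all assumptions chosen for the example, and
the consensus is written in standard IUPAC ambiguity codes. Both strands are
scanned, and overlapping matches are reported.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles ambiguity codes (R<->Y, K<->M, B<->V, D<->H).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA or IUPAC consensus string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(sequence, consensus):
    """Return sorted (start, strand) pairs for every consensus match.

    `start` is the 0-based position on the forward strand. A lookahead
    pattern is used so that overlapping matches are all reported.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        pattern = "(?=(" + "".join(IUPAC[base] for base in motif) + "))"
        for match in re.finditer(pattern, sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical consensus and sequence, used only to exercise the function.
    example_consensus = "YCAAGGYCR"
    example_region = "GGTTCAAGGTCAGGTGACCTTGAAC"
    for start, strand in find_sites(example_region, example_consensus):
        print(start, strand)
```

The same function can be used to count how many copies of an element are
present in a candidate 5' regulatory region.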

In a preferred embodiment of the invention, multiple copies of the
appropriate response element are placed in the 5' regulatory region, which
allows multiple sites for binding of receptor polypeptide resulting in a greater
degree of activation.

X.C. Ligand Binding Domain of a Receptor Expression Cassette 
Preferably, the ligand binding domain of a chimeric receptor of the
present invention comprises the "E" domain of any one of SEQ ID NOs:2, 6,
10, 14, 18, 20, 22, 24, and 26. For the purpose of creating a receptor
expression cassette, the "E" domain is defined operationally as a sequence
that is sufficient for ligand binding and can be determined by methods known
in the art. A ligand binding domain of an insect nuclear receptor disclosed
herein can further be modified to permit and/or optimize binding to a selected
ligand. Such modifications can include point mutations and truncation.

X.D. Activation/Repression Domain of the Receptor Expression 
Cassette 

Transactivation (A/B) domains can be defined as amino acid
sequences that, when combined with the DNA binding domain in a receptor
polypeptide, increase productive transcription initiation by RNA polymerases.
See, generally, Ptashne (1988) Nature 335:683-689; Meshi (1995) Plant Cell
Physiol 36:1405-1420. Different transactivation domains are known to have
different degrees of effectiveness in their abilities to increase transcription
initiation. In the present invention, it is desirable to use transactivation
domains that have superior transactivating effectiveness in host cells in
order to create a high level of target expression cassette expression in




response to the presence of chemical ligand. Representative transactivation
domains include but are not limited to herpes simplex virus VP16
(Triezenberg et al. (1988) Genes Dev 2(6):718-729), maize C1 (Goff et al.
(1991) Genes Dev 5:298-309), Arabidopsis AP1, and maize Dof1.

As described above, the method of the present invention can be used
to increase gene expression over a minimal, basal level. One of the
outstanding benefits of the present method, however, is that it can also be
used for decreasing or inhibiting gene expression, i.e., gene repression.
Controlling gene expression through repression can be accomplished using
a repression domain in place of the transactivation domain. Repression
domains can be defined as amino acid sequences that, when combined with
the DNA binding domain in a receptor polypeptide, decrease the productive
transcription initiation by RNA polymerases (Ng (2000) Trends Biochem Sci
25:121-126). Repression domains that can be used with the present
invention to decrease expression of a target cassette include but are not
limited to the repression domains of AtHD2A (Wu (2000) Plant J 22:19-27),
Oshox1, and Oshox3 (Meijer (2000) Mol Gen Genet 263:12-21).

X.E. Additional Features of Gene Switch Expression Cassettes 
The transgenic expression of genes can also involve modification of
the relevant transgenes to achieve and optimize their expression in a
selected host. Modifications can include but are not limited to cloning of
open reading frames normally encoded by a single gene in separate
expression cassettes, utilization of host codon preferences, adjustment of
GC/AT content, inclusion of sequences adjacent to the initiating methionine
codon that promote efficient translation, and removal of illegitimate splice
sites.

An expression cassette can also comprise any additional sequences
required or selected for the expression of the transgene. Such sequences
include, but are not limited to, transcription terminators, introns, sequences
that can enhance gene expression, and sequences that mediate intracellular
targeting of the gene product.

X.F. Ligands for Inducible Gene Expression

A ligand that activates a chimeric receptor polypeptide and that can
be used in a gene switch expression system can be identified, for example,
using methods described herein above under the heading "Identification of
Insect Nuclear Receptor Modulators".

XI. Transgenic Plants

The present invention envisions expression of insect nuclear receptor
modulators and components of nuclear receptor gene switch expression
systems in plants. Representative techniques for transforming
dicotyledonous and monocotyledonous plants are described herein below.

The phrase "a plant, or parts thereof," as used herein shall mean an
entire plant; and shall mean the individual parts thereof, including but not
limited to seeds, leaves, stems, and roots, as well as plant tissue cultures.
Transgenic plants of the present invention are understood to encompass not
only the end product of a transformation method, but also transgenic
progeny thereof.

Representative plants that can be used in transgenic methods
disclosed herein include but are not limited to rice, wheat, barley, rye, corn,
potato, carrot, sweet potato, sugar beet, bean, pea, chicory, lettuce,
cabbage, cauliflower, broccoli, turnip, radish, spinach, asparagus, onion,
garlic, eggplant, pepper, celery, squash, pumpkin, zucchini,
cucumber, apple, pear, quince, melon, plum, cherry, peach, nectarine,
apricot, strawberry, grape, raspberry, blackberry, pineapple, avocado,
papaya, mango, banana, tobacco, tomato, sorghum and sugarcane.

XI.A. Promoters

For in vivo production of an insect nuclear receptor modulator or a
chimeric receptor expression cassette in plants, exemplary constitutive
promoters are derived from the CaMV 35S, rice actin, and maize ubiquitin
genes. See Binet et al. (1991) Plant Sci 79:87-94, Christensen et al. (1989)
Plant Mol Biol 12:619-632, Callis et al. (1990) J Biol Chem 265:12486-12493,
Norris et al. (1993) Plant Mol Biol 21:895-906, European Patent
Application Nos. 0 342 926 and 0 392 225, Taylor et al. (1993) Plant Cell Rep




12:491-495, McElroy et al. (1990) Plant Cell 2:163-171, McElroy et al.
(1991) Mol Gen Genet 231:150-160, Chibbar et al. (1993) Plant Cell Rep
12:506-509.

Representative inducible promoters suitable for use with the
present invention include the chemically inducible PR-1 promoter, the PR-1a
promoter, an ethanol-inducible promoter, a glucocorticoid inducible
promoter, and a wound-inducible promoter. See Uknes et al. (1992) Plant
Cell 4:645-656, Lebel et al. (1998) Plant J 16:223-233, Caddick et al. (1998)
Nat Biotechnol 16:177-180, Aoyama & Chua (1997) Plant J
11:605-612, Xu et al. (1993) Plant Mol Biol 22:573-588; Logemann et al.
(1989) Plant Cell 1:151-158, Rohrmeier & Lehle (1993) Plant Mol Biol
22:783-792, Firek et al. (1993) Plant Mol Biol 22:129-142, and Warner et al.
(1993) Plant J 3:191-201.

Selected promoters can direct expression in
specific cell types (such as leaf epidermal cells, mesophyll cells, root cortex
cells) or in specific tissues or organs (roots, leaves or flowers, for example).
Representative promoters that direct cell- or tissue-specific expression in
plants and can be used in accordance with the present invention include but
are not limited to a root-specific promoter (de Framond (1991) FEBS Lett
290:103-106, U.S. Patent No. 5,466,785), a pith-preferred promoter
(International Publication No. WO 93/07278), a leaf-specific promoter
(Hudspeth & Grula (1989) Plant Mol Biol 12:579-589), and a pollen-specific
promoter (International Publication No. WO 93/07278).

XI.B. Vectors

The expression cassette is cloned into a vector suitable for
transformation. Suitable expression vectors which can be used include, but
are not limited to, the following vectors or their derivatives: plant
transformation vectors, viruses such as vaccinia virus or adenovirus,
baculovirus vectors, yeast vectors, bacteriophage vectors (e.g., lambda
phage), plasmid and cosmid DNA vectors, and transposon-mediated
transformation vectors.

Numerous vectors available for plant transformation are known to
those of ordinary skill in the plant transformation arts, and the genes
pertinent to this invention can be used with any such vectors. Exemplary
vectors include pCIB200, pCIB2001, pCIB10, pCIB3064, pSOG19, and
pSOG35. The selection of vector will depend upon the preferred
transformation technique and the target species for transformation.

Many vectors are available for transformation using Agrobacterium
tumefaciens. These typically carry at least one T-DNA border sequence and
include vectors such as pBIN19 (Bevan (1984) Nuc Acids Res 12:8711-8721)
and pXYZ. See also European Patent Application No. 0 332 104,
herein incorporated by reference.

Transformation without the use of Agrobacterium tumefaciens
circumvents the requirement for T-DNA sequences in the chosen
transformation vector and consequently vectors lacking these sequences
can be utilized in addition to vectors such as the ones described above
which contain T-DNA sequences. Transformation techniques that do not
rely on Agrobacterium include transformation via particle bombardment,
protoplast uptake (e.g., electroporation), and microinjection. The choice of
vector depends largely on the preferred selection for the plant species being
transformed.

For certain target species, different antibiotic or herbicide selection
markers can be preferred. Selection markers used routinely in
transformation include the nptII gene, which confers resistance to kanamycin
and related antibiotics (Messing & Vieira (1982) Gene 19:259-268; Bevan et
al. (1983) Nature 304:184-187), the bar gene, which confers resistance to
the herbicide phosphinothricin (White et al. (1990) Nuc Acids Res 18:1062,
Spencer et al. (1990) Theor Appl Genet 79:625-631), the hph gene, which
confers resistance to the antibiotic hygromycin (Blochlinger & Diggelmann
(1984) Mol Cell Biol 4:2929-2931), the dhfr gene, which confers
resistance to methotrexate (Bourouis et al. (1983) EMBO J 2(7):1099-1104),
the EPSPS gene, which confers resistance to glyphosate (U.S. Patent Nos.
4,940,935 and 5,188,642), and the mannose-6-phosphate isomerase gene,
which provides the ability to metabolize mannose (U.S. Patent Nos.
5,767,378 and 5,994,629).

XI.C. Transformation of Dicotyledons

Transformation techniques for dicotyledons are known in the art and
include Agrobacterium-based techniques and techniques that do not require
Agrobacterium. Non-Agrobacterium techniques involve the uptake of
exogenous genetic material directly by protoplasts or cells. This can be
accomplished by polyethylene glycol (PEG)- or electroporation-mediated
uptake, particle bombardment-mediated delivery, or microinjection. Examples
of these techniques are described by Paszkowski et al. (1984) EMBO J
3:2717-2722; Potrykus et al. (1985) Mol Gen Genet 199:169-177; Reich et al.
(1986) Biotechnology 4:1001-1004; Klein et al. (1987) Nature 327:70-73; and
U.S. Patent Nos. 4,945,050, 5,036,006, and 5,100,792. Using any of the
aforementioned methods, the transformed cells can be regenerated to whole
plants using standard techniques known in the art.

XI.D. Transformation of Monocotyledons

Transformation of most monocotyledon species has now also become
routine. Preferred techniques include direct gene transfer into protoplasts
using PEG or electroporation techniques, and particle bombardment into
callus tissue. See European Patent Application Nos. 0 292 435, 0 392 225,
and 0 332 581; International Publication Nos. WO 93/07278 and WO
93/21335; Gordon-Kamm et al. (1990) Plant Cell 2:603-618; Fromm et al.
(1990) Biotechnology 8:833-839; Koziel et al. (1993) Biotechnology
11:194-200; Zhang et al. (1988) Plant Cell Rep 7:379-384; Shimamoto et al.
(1989) Nature 338:274-277; Datta et al. (1990) Biotechnology 8:736-740;
Christou et al. (1991) Biotechnology 9:957-962; Vasil et al. (1992)
Biotechnology 10:667-674; Vasil et al. (1993) Biotechnology 11:1553-1558;
and Weeks et al. (1993) Plant Physiol 102:1077-1084. More recently,
transformation of monocotyledons using Agrobacterium has been described.
See International Publication No. WO 94/00977 and U.S. Patent No.
5,591,616, both of which are incorporated herein by reference.

XII. Methods of Inducible Gene Expression

The present invention further provides a method of controlling gene
expression in an organism, the method comprising: (a) transforming the
organism with a receptor expression cassette comprising a 5' regulatory
region capable of promoting expression operatively linked to a receptor
cassette encoding a chimeric receptor polypeptide of the invention, and a 3'
terminating region; (b) transforming the organism with a target expression
cassette comprising a 5' regulatory region operatively linked to a target
nucleotide sequence, wherein the 5' regulatory region comprises one or
more response elements that are recognized by the DNA binding domain of
the chimeric receptor polypeptide; (c) expressing the chimeric receptor
polypeptide in the organism; and (d) contacting the organism with a
chemical ligand that binds to the ligand binding domain of the chimeric
receptor polypeptide, whereby the chimeric receptor polypeptide activates
expression of the target nucleotide sequence.
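
The logic of steps (a) through (d) can be summarized as a small truth-table
model. The sketch below is only a conceptual aid under simplifying
assumptions: the class names, the string labels for response elements and
ligands, and the three-state outcome are hypothetical, and real expression
levels are continuous rather than discrete. The optional repressor flag
reflects the use of a repression domain in place of a transactivation domain,
as described under heading X.D.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChimericReceptor:
    dna_binding_domain: str      # label of the response element the DBD recognizes
    ligand: str                  # label of the chemical ligand bound by the LBD
    is_repressor: bool = False   # True if a repression domain replaces the A/B domain


@dataclass(frozen=True)
class TargetCassette:
    response_elements: tuple     # labels of elements in the 5' regulatory region
    gene: str                    # label of the target nucleotide sequence


def target_state(receptor, cassette, receptor_expressed, ligands_present):
    """Return 'basal', 'activated', or 'repressed' for the target gene.

    The switch changes state only when the receptor is expressed, its DNA
    binding domain recognizes an element in the target cassette, and its
    ligand is present.
    """
    recognized = receptor.dna_binding_domain in cassette.response_elements
    liganded = receptor.ligand in ligands_present
    if not (receptor_expressed and recognized and liganded):
        return "basal"
    return "repressed" if receptor.is_repressor else "activated"


if __name__ == "__main__":
    # Hypothetical labels used only to exercise the model.
    receptor = ChimericReceptor(dna_binding_domain="GAL4", ligand="ligand-X")
    cassette = TargetCassette(response_elements=("GAL4",), gene="reporter")
    print(target_state(receptor, cassette, True, {"ligand-X"}))
    print(target_state(receptor, cassette, True, set()))
```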

Methods employing a gene switch system as disclosed herein are
useful for regulated expression in any organism that can express a nuclear
receptor expression cassette, including both plants and animals.

In a preferred embodiment, nuclear receptor cassettes comprising
disclosed nuclear receptor sequences are useful for the regulation of
expression of target polypeptides in plants in the presence of appropriate
chemical ligands. For example, U.S. Patent No. 5,880,333 is drawn to a
method for controlling gene expression in plants comprising transforming a
plant with expression cassettes encoding a nuclear receptor polypeptide and
a target sequence. The method is useful for controlling various traits of
agronomic importance.

The gene switch system disclosed herein can also be adapted to
gene therapy methods. The insect nuclear receptor components of a
receptor cassette disclosed herein are particularly relevant to regulated
expression in mammals based on their heterologous derivation. Utilization
of insect nuclear receptor ligand binding domains, and ligands that
specifically activate such nuclear receptors, will minimize the possibility of
cross-reactivity with endogenous mammalian receptors.

Examples

The following Examples have been included to illustrate modes of the
invention. Certain aspects of the following Examples are described in terms
of techniques and procedures found or contemplated by the present
co-inventors to work well in the practice of the invention. These Examples
illustrate standard laboratory practices of the co-inventors. In light of the
present disclosure and the general level of skill in the art, those of skill will
appreciate that the following Examples are intended to be exemplary only
and that numerous changes, modifications, and alterations can be employed
without departing from the scope of the invention.

Example 1 

Database Searches 

A pileup alignment of Drosophila and C. elegans nuclear receptor
DNA-binding domains was generated using the GCG program (Devereux et
al. (1984) Nuc Acids Res 12:387-395). The pileup was used to build a
profile Hidden Markov Model (HMM) according to the HMMER 2.1.1 program
(available from Washington University School of Medicine, St. Louis,
Missouri). HMMER 2.1.1 hmmbuild parameters were selected for maximal
sensitivity for identifying complete fragments and excluding local alignments.
The profile HMM was further calibrated according to the program
instructions. A database (referred to herein as "the GeneMark database")
was assembled by predicting proteins using the GeneMark program
(Borodovsky & McIninch (1993) Computers & Chemistry 17:123-133) based
on 50 kb Drosophila genomic sequence scaffolds that had been generated
at Celera Genomics, Inc. (Rockville, Maryland). The profile HMM was also
used to search a database generated by Celera using alternative protein
prediction programs (referred to herein as the "Celera predicted protein
database").
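
The general idea of scoring predicted proteins against a model built from an
alignment can be illustrated with a much simpler technique than the one used
above. The sketch below builds a position-specific scoring matrix (log-odds
scores per alignment column) and slides it along a query protein. It is not a
profile HMM and does not reproduce HMMER: it has no insert or delete states
and no score calibration. The alignment strings, the query, the uniform
background, and the pseudocount are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Build per-column log-odds scores from gap-free, equal-length sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_hit(profile, protein):
    """Return (score, start) of the best ungapped window, or None if too short.

    Residues outside the 20 standard amino acids contribute a neutral 0.0.
    """
    width = len(profile)
    if len(protein) < width:
        return None
    return max(
        (sum(col.get(aa, 0.0) for col, aa in zip(profile, protein[i:i + width])), i)
        for i in range(len(protein) - width + 1)
    )


if __name__ == "__main__":
    # Toy alignment and query, invented for illustration only.
    toy_alignment = ["CEGCKGFFKR", "CEGCKGFFRR", "CDGCKGFFKR", "CEGCKAFFKR"]
    toy_profile = build_profile(toy_alignment)
    print(best_hit(toy_profile, "MSTACEGCKGFFKRSLQ"))
```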
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Example 2 

Isolation of Drosophila melanogaster Nuclear Receptor cDNAs

cDNA clones of Drosophila nuclear receptors were obtained by PCR
using a first strand Drosophila cDNA pool as template. PCR primers were
designed to include the predicted start and stop codons of each receptor
using the primer3 application (available through Whitehead/MIT Center for
Genome Research of Cambridge, Massachusetts). Amplified products were
cloned in the pUni/V5-His-TOPO vector (Invitrogen Corporation of Carlsbad,
California). Cloned inserts were sequenced on both strands by primer
walking using an ABI PRISM® 3700 DNA Analyzer (Applied Biosystems of
Foster City, California) to an error rate of less than 1 in 10,000
nucleotides.

Example 3 

Cloning Heliothis virescens Nuclear Receptors by Library Screening 

A cDNA library was constructed in the Uni-ZAP XR vector (Stratagene
of La Jolla, California) using oligo-dT-primed transcripts from Heliothis
virescens embryos aged 0-24 hours at 25°C. Library clones were excised
from Uni-ZAP XR as pBluescript phagemids. The library was transformed
into E. coli by electroporation and plated into 384-well plates with an average
calculated density of 4-6 colonies per well. Plates were arrayed by a Q-Bot
(Genetix Pharmaceuticals, Inc. of Cambridge, Massachusetts) onto a 22 x
22 cm nylon filter. The filter was probed under low stringency conditions
(representative low-stringency conditions are overnight hybridization at 55°C
in Church's Buffer followed by washing in 2X SSC, 0.1% SDS, twice at room
temperature and once at 42°C, each wash approximately 30 min). The
probe was a mixture of random primed radioactively labeled DNA binding
domains (DBDs) of Drosophila melanogaster nuclear receptors EcR, usp,
DHR38, DHR39, βFTZ-F1 and E75. The DBD fragments were generated by
PCR using nondegenerate primers corresponding to the indicated DBDs.

For secondary screening, E. coli from positive wells were spread onto
agar plates, and 96 colonies corresponding to each primary screen well were
picked into 384-well plates. The colonies were arrayed onto a nylon
membrane and re-probed using identical conditions. Positive clones were
sequenced.

Example 4 

Cloning Heliothis virescens Nuclear Receptors by PCR

PolyA+ RNA was prepared from Heliothis virescens embryos aged 0-24
hours at 25°C. cDNA was prepared by reverse transcription using random
primers and a GeneAmp kit (Perkin Elmer of San Jose, California). The
reverse transcription reaction was allowed to proceed for 15 min at 42°C,
and the reaction was stopped by incubating the reaction for 5 min at 99°C.
For amplification of nuclear receptors, nested, degenerate primers were
designed according to the Drosophila βFTZ-F1 sequence (SEQ ID NOs:35-38).
Primers included restriction enzyme sites to facilitate cloning into pBluescript
(Stratagene of La Jolla, California). Cycling parameters for amplification
using degenerate primers were as follows: initial amplification: 2 min at
95°C; 35 cycles of 15 sec at 95°C, 30 sec at 60°C; 7 min at 72°C; hold at
4°C; second amplification: 2 min at 95°C; 35 cycles of 15 sec at 95°C, 30
sec at 60°C, 2.5 min at 72°C; 7 min at 72°C; hold at 4°C.
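
For planning purposes, the two cycling programs above can be written down as
data and their nominal run times computed. The sketch below is illustrative
only: the class and variable names are hypothetical, the steps are
transcribed exactly as listed above (the initial amplification is listed with
two steps per cycle), and the totals ignore temperature ramp times and the
final 4°C hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    seconds: float
    celsius: float


@dataclass(frozen=True)
class CyclingProgram:
    initial: Step          # initial denaturation
    cycles: int            # number of repeated cycles
    cycle_steps: tuple     # steps repeated in every cycle
    final: Step            # final extension

    def runtime_seconds(self):
        """Nominal run time, excluding ramping and the 4 degree hold."""
        per_cycle = sum(step.seconds for step in self.cycle_steps)
        return self.initial.seconds + self.cycles * per_cycle + self.final.seconds


INITIAL_AMPLIFICATION = CyclingProgram(
    initial=Step(120, 95),
    cycles=35,
    cycle_steps=(Step(15, 95), Step(30, 60)),
    final=Step(420, 72),
)

SECOND_AMPLIFICATION = CyclingProgram(
    initial=Step(120, 95),
    cycles=35,
    cycle_steps=(Step(15, 95), Step(30, 60), Step(150, 72)),
    final=Step(420, 72),
)

if __name__ == "__main__":
    for name, program in (("initial", INITIAL_AMPLIFICATION),
                          ("second", SECOND_AMPLIFICATION)):
        print(name, program.runtime_seconds() / 60.0, "min")
```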

Amplification of cDNA ends was performed using a MARATHON
RACE kit (Clontech Laboratories, Inc. of Palo Alto, California).
Gene-specific primers (SEQ ID NOs:39-42) were designed according to Heliothis
virescens nuclear receptor sequences obtained as described herein above.
cDNA libraries generated from Heliothis virescens adult head and/or larval
gut were used as template. RACE products were cloned into the TOPO-A
vector (Invitrogen Corporation of Carlsbad, California).

Example 5 

Double-Stranded RNA Interference 

Preparation of dsRNA for Injection. Sequences to be expressed as
dsRNA were cloned into Bluescript KS(+) (Stratagene of La Jolla, California),
linearized with the appropriate restriction enzymes, and transcribed in vitro
with the Ambion T3 and T7 MEGASCRIPT® high yield transcription kits
following the manufacturer's instructions (Ambion Inc. of Austin, Texas).
Transcripts were annealed in injection buffer (0.1 mM NaPO4 pH 7.8, 5 mM
KCl) after heating to 85°C and cooling to room temperature over a 1- to
24-hr period. All annealed transcripts were analyzed on agarose gels with DNA
markers to confirm the size of the annealed RNA and quantitated as
described previously (Fire et al. (1998) Nature 391(6669):806-811). Injected
RNA was not gel-purified. Injection of 0.1 nl of a 0.1- to 1.0-mg/ml solution of
a 1-kb dsRNA corresponds to roughly 10^7 molecules/injection.
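
The molecule count quoted above can be checked with a short calculation.
The sketch below assumes an average mass of about 680 g/mol per base pair of
dsRNA (roughly 340 g/mol per nucleotide); that figure and the function name
are assumptions introduced for the example, not values taken from this
document. Under that assumption, the 0.1 mg/ml end of the range gives a
little under 10^7 molecules and the 1.0 mg/ml end a little under 10^8.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
GRAMS_PER_MOL_PER_BP = 680.0    # assumed average mass of one dsRNA base pair


def molecules_per_injection(volume_nl, conc_mg_per_ml, length_bp):
    """Estimate the number of dsRNA molecules delivered in one injection."""
    volume_ml = volume_nl * 1e-6                    # 1 nl = 1e-6 ml
    mass_g = volume_ml * conc_mg_per_ml * 1e-3      # mg converted to g
    molar_mass = length_bp * GRAMS_PER_MOL_PER_BP   # g/mol for the duplex
    return mass_g / molar_mass * AVOGADRO


if __name__ == "__main__":
    for conc in (0.1, 1.0):
        count = molecules_per_injection(0.1, conc, 1000)
        print(f"{conc} mg/ml -> {count:.2e} molecules")
```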

Injection of Drosophila melanogaster Embryos. Fly cages were set up
using 2- to 4-day-old flies. Agar-grape juice plates were replaced every hour
for 1-2 days to synchronize the egg collection. The eggs were collected over a
30- to 60-min period for subsequent injection. The eggs were washed into a
nylon mesh basket with tap water. The chorion was removed by brief
soaking in a dilute bleach solution. Eggs were positioned on a glass slide
such that each egg was in the same orientation. Double-stranded RNA was
injected into the middle of each egg using an Eppendorf transjector (Eppendorf
Scientific, Inc. of Westbury, New York). Following injection, slides were
stored in a moist chamber to prevent desiccation of the embryos. Embryos
were monitored for development and transferred as first instar larvae to vials
containing Drosophila medium. Methods for rearing and staging Drosophila,
and common genetic techniques, can be found, for example, in Roberts (1986)
Drosophila melanogaster: A Practical Approach, IRL Press, Washington,
DC; Ashburner (1989a) Drosophila: A Laboratory Handbook, Cold Spring
Harbor Laboratory Press, New York, New York; Ashburner (1989b)
Drosophila: A Laboratory Manual, Cold Spring Harbor Laboratory Press,
New York, New York; Goldstein & Fyrberg, eds (1994) in Methods in Cell
Biology, Vol. 44, Academic Press, San Diego, California.

Example 6

Overexpression of Nuclear Receptors in Drosophila melanogaster

The following Drosophila melanogaster lines were used for
over-expression analysis:

w1118
w1118; +/SM5; P[hs-E75A w+]/TM3
w1118; P[hs-E75A-H w+]/CyO; Dr/TM3
w1118; P[hs-DHR39-6 w+]
w1118; P[hs-DHR39-3 w+]
yw, P[hs-DHR38 w+]-\\

All lines were acquired from the public Drosophila melanogaster Stock
Center of Bloomington, Indiana (http://flystocks.bio.indiana.edu) or from Dr.
Carl Thummel of Howard Hughes Medical Institute, Salt Lake City, Utah.
Drosophila melanogaster genotypes are indicated according to standard
nomenclature in the field. See Lindsley & Zimm (1992) The Genome of
Drosophila melanogaster, Academic Press, San Diego, California.

Embryos were collected on grape juice/agar plates at 25°C. First
instar larvae were staged at embryo hatching. Collections of first instar
larvae aged 0-2 hours +/- 15 minutes post-hatching were allowed to develop
to the desired stage at 25°C.

Heat treatments were performed by placing staged larvae on grape
juice/agar plates in a 37°C warm room for 2 hours. Larvae were then
transferred to vials containing Drosophila medium and returned to 25°C for
post-heat treatment recovery. The duration of heat treatment was
determined empirically to result in minimal lethality of control lines. Viability
was scored at 24 hours post heat treatment by observation of larval
movement and/or at four days post heat treatment by counting the number of
pupae. All heat treatment experiments were performed using collections of
50 larvae per genotype.

Example 7

Recombinant Production of a Nuclear Receptor in E. coli

A cDNA clone of the present invention is subcloned into an
appropriate expression vector and transformed into E. coli using the
manufacturer's conditions. Specific examples include plasmids such as
pBluescript (Stratagene of La Jolla, California), pFLAG (International
Biotechnologies, Inc. of New Haven, Connecticut), and pTrcHis (Invitrogen
Corporation of Carlsbad, California). E. coli are cultured, expression of the
recombinant protein is confirmed, and recombinant protein is isolated using
standard techniques.

Example 8 

Recombinant Production of a Nuclear Receptor in Insect Cells 

Baculovirus vectors, which are derived from the genome of AcNPV
virus, are designed to provide high levels of expression of cDNA in the
Spodoptera frugiperda (Sf9) line of insect cells (ATCC CRL# 1711).
Recombinant baculovirus expressing the cDNA of the present invention is
produced by the following standard methods (Invitrogen MaxBac Manual,
Invitrogen Corporation of Carlsbad, California): cDNA constructs are ligated
into the polyhedrin gene in any one of a variety of baculovirus transfer
vectors, including the pAC360 and the BlueBac vector (Invitrogen Corporation
of Carlsbad, California). Recombinant baculoviruses are generated by
homologous recombination following co-transfection of the baculovirus
transfer vector and linearized AcNPV genomic DNA (Kitts (1990) Nucleic
Acids Res 18:5667) into Sf9 cells. Recombinant pAC360 viruses are
identified by the absence of inclusion bodies in infected cells and
recombinant pBlueBac viruses are identified on the basis of β-galactosidase
expression (Summers & Smith, Texas Agriculture Exp Station Bulletin No.
1555).

A cDNA encoding an entire open reading frame for a gene is inserted
into the BamH I site of pBlueBacII (Invitrogen Corporation of Carlsbad,
California). Constructs in the positive orientation, identified by sequence
analysis, are used to transfect Sf9 cells in the presence of linear AcNPV
wild type DNA. The recombinant insect nuclear receptor is present in the
cytoplasm of infected cells. The recombinant insect nuclear receptor is
extracted from infected cells by hypotonic or detergent lysis.

Example 9 

In vitro Binding Assays 

Recombinant protein can be obtained, for example, according to the 
approach described in Example 7 or 8 herein above. The protein is 
immobilized on chips appropriate for ligand binding assays. The protein 
immobilized on the chip is exposed to a candidate substance according to 
methods known in the art. While the sample compound is in contact with the 
immobilized protein, measurements capable of detecting protein-ligand 
interactions are conducted. Measurement techniques include, but are not 
limited to, SELDI, Biacore, and FCS, as described above. Substances that 
bind the protein are readily discovered using this approach and are 
subjected to further characterization. 
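
By way of non-limiting illustration, the selection of binding substances from such measurements can be sketched as follows. The present description does not specify a hit criterion. The rule used here (a signal exceeding the mean of blank measurements by three standard deviations), the function name, and the example values are assumptions made for illustration only, and apply to any measurement technique that yields one numeric signal per candidate.

```python
# Illustrative sketch only; the threshold rule and names are hypothetical.
from statistics import mean, stdev


def select_binders(signals, blank_signals, n_sd=3.0):
    """Return the names of candidates whose signal exceeds the blank threshold.

    signals:       mapping of candidate name -> measured binding signal
    blank_signals: signals measured with no candidate substance present
    n_sd:          number of blank standard deviations above the blank mean
    """
    threshold = mean(blank_signals) + n_sd * stdev(blank_signals)
    return sorted(name for name, value in signals.items() if value > threshold)


# Hypothetical example values, not measured data.
example_hits = select_binders(
    {"candidate_A": 12.0, "candidate_B": 4.9, "candidate_C": 5.0},
    blank_signals=[1.0, 2.0, 3.0],
)
```

Candidates returned by such a filter would correspond to the substances that are subjected to further characterization.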
changed without departing from the scope of the invention. Furthermore, the 

CLAIMS 

What is claimed is: 

1. An isolated insect nuclear receptor polypeptide comprising: 

(a) a polypeptide encoded by the nucleotide sequence of 
any one of SEQ ID NOs:1, 5, 9, 17, 19, 21, 23, and 25; 

(b) a polypeptide encoded by a nucleic acid molecule that is 
substantially identical to any one of SEQ ID NOs:1, 5, 9, 
17, 19, 21, 23, and 25; 

(c) a polypeptide comprising the amino acid sequence of 
any one of SEQ ID NOs:2, 6, 10, 18, 20, 22, 24, and 26; 

(d) a polypeptide that is a biological equivalent of the 
polypeptide of any one of SEQ ID NOs:2, 6, 10, 18, 20, 
22, 24, and 26; or 

(e) a polypeptide which is immunologically cross-reactive 
with an antibody that shows specific binding with a 
polypeptide of any one of SEQ ID NOs:2, 6, 10, 18, 20, 
22, 24, and 26. 

2. An isolated nucleic acid molecule encoding an insect nuclear 
receptor polypeptide comprising: 

(a) a nucleotide sequence of any one of SEQ ID NOs:1, 5, 
9, 17, 19, 21, 23, and 25; or 

(b) a nucleic acid molecule substantially identical to any one 
of SEQ ID NOs:1, 5, 9, 17, 19, 21, 23, and 25. 

3. A chimeric gene, comprising the nucleic acid molecule of claim 
2 operatively linked to a heterologous promoter. 

4. A vector comprising the chimeric gene of claim 3. 

5. A host cell comprising the chimeric gene of claim 3. 

6. The host cell of claim 5, wherein the cell is selected from the 
group consisting of a bacterial cell, an insect cell, and a plant cell. 

7. A method of detecting a nucleic acid molecule that encodes an 
insect nuclear receptor polypeptide, the method comprising: 

(a) procuring a biological sample comprising nucleic acid 
material; 

(b) hybridizing the nucleic acid molecule of claim 2 under 
stringent hybridization conditions to the biological 
sample of (a), thereby forming a duplex structure 
between the nucleic acid of claim 2 and a nucleic acid 
within the biological sample; and 

(c) detecting the duplex structure of (b), whereby a nuclear 
receptor nucleic acid molecule is detected. 

8. An antibody that specifically recognizes an insect nuclear 
receptor polypeptide of claim 1. 

9. A method for producing an antibody that specifically recognizes 
an insect nuclear receptor polypeptide, the method comprising: 

(a) recombinantly or synthetically producing an insect 
nuclear receptor polypeptide, or portion thereof, as set 
forth in any of SEQ ID NOs:2, 6, 10, 18, 20, 22, 24, and 
26; 

(b) formulating the polypeptide of (a) whereby it is an 
effective immunogen; 

(c) administering to an animal the formulation of (b) to 
generate an immune response in the animal comprising 
production of antibodies, wherein antibodies are present 
in the blood serum of the animal; and 

(d) collecting the blood serum from the animal of (c), the 
blood serum comprising antibodies that specifically 
recognize a nuclear receptor polypeptide. 

10. A method for detecting a level of nuclear receptor polypeptide, 
the method comprising: 

(a) obtaining a biological sample comprising peptidic 
material; and 

(b) detecting a nuclear receptor polypeptide in the biological 
sample of (a) by immunochemical reaction with the 
antibody of claim 8, whereby a level of nuclear receptor 
polypeptide in a sample is determined. 

11. A method for identifying a substance that modulates nuclear 
receptor function, the method comprising: 

(a) isolating an insect nuclear receptor polypeptide of claim 
1; 

(b) exposing the isolated insect nuclear receptor 
polypeptide to a plurality of candidate substances; 

(c) assaying binding of a candidate substance to the 
isolated nuclear receptor polypeptide; and 

(d) selecting a candidate substance that demonstrates 
specific binding to the isolated insect nuclear receptor 
polypeptide. 

12. A method for identifying an insecticidal substance that 
modulates nuclear receptor function, the method comprising: 

(a) isolating an insect nuclear receptor polypeptide of any 
one of even numbered SEQ ID NOs:2-34, wherein 
modulation of the insect nuclear receptor polypeptide 
confers lethality of an insect during a larval stage; 

(b) exposing the isolated insect nuclear receptor 
polypeptide to a plurality of substances; 

(c) assaying binding of a substance to the isolated nuclear 
receptor polypeptide; and 

(d) selecting a substance that demonstrates specific binding 
to the isolated insect nuclear receptor polypeptide. 

13. A method for preventing or abrogating an insect infestation of a 
plant, the method comprising: 

(a) preparing an insecticidal composition that includes an 
insect nuclear receptor modulator identified according to 
the method of claim 12; and 

(b) contacting an effective dose of the insecticidal 
composition with a plant, whereby an insect infestation 
of a plant is prevented or abrogated. 

14. The method of claim 13, wherein the insecticidal composition 
comprises a chemical compound, a protein, a peptide, a nucleic acid, or an 
antibody. 

15. A method for preventing or abrogating a nematode infestation 
of a plant, the method comprising: 

(a) preparing an insecticidal composition that includes an 
insect nuclear receptor modulator identified according to 
the method of claim 12; and 

(b) contacting an effective dose of the insecticidal 
composition with a plant, whereby a nematode 
infestation of a plant is prevented or abrogated. 

16. A method for preventing or abrogating an insect infestation of a 
plant, the method comprising expressing in a plant an insect nuclear 
receptor modulator that modulates the activity of an insect nuclear receptor 
polypeptide of claim 1, whereby an insect infestation of the plant is 
prevented or abrogated. 

17. The method of claim 16, wherein the bioactive agent comprises 
a protein, a peptide, a nucleic acid, or an antibody. 

18. A method for preventing or abrogating a nematode infestation 
of a plant, the method comprising expressing in a plant a bioactive agent 
that modulates the activity of an insect nuclear receptor polypeptide of claim 
1, whereby a nematode infestation of the plant is prevented or abrogated. 

19. A chimeric nuclear receptor cassette comprising a DNA binding 
domain, a ligand binding domain, and an activation or repression domain, 
wherein one or more of the DNA binding domain, the ligand binding domain, 
and the activation domain comprises an amino acid sequence that is 
identical or substantially identical to a portion of any one of SEQ ID NOs:2, 
6, 10, 18, 20, 22, 24, and 26. 

20. A method of inducing expression of a target nucleotide 
sequence, the method comprising: 

(a) constructing a chimeric nuclear receptor expression 
cassette of claim 19; 

(b) constructing a target expression cassette having a 
target nucleotide sequence and a cis-regulatory 
element that is recognized by a DNA-binding domain of 
the chimeric nuclear receptor expression cassette; 

(c) expressing the chimeric nuclear receptor expression 
cassette and the target expression cassette in a 
heterologous organism; and 

(d) contacting a ligand that binds to the ligand binding 
domain of the chimeric nuclear receptor expression 
cassette with the organism, whereby the target 
nucleotide sequence is expressed. 

21. The method of claim 20, wherein the heterologous organism is 
a plant. 
[Figure: amino acid sequence alignment of the sequences labeled hv.bftz, bmftz, and dmftz.] 
[Figure 4 (panels include Figure 4E and Figure 4F): amino acid sequence alignment of the sequences labeled D. melanogaster (A), M. sexta (A), D. melanogaster (B), M. sexta (B), D. melanogaster (C), C. fumiferana, G. mellonella, M. ensis, and H. virescens.] 



WO 02/077157 PCT/US02/11257 

[FIG. 5: plot with the horizontal axis labeled "Receptor" and the categories w1118, DHR38, DHR39, and E75A.] 
Sequence Listing 

<110> Syngenta 

Broadus, Julie 
Brown, Blanche 
Stam, Lynn 
Kamdar, Kim 

<120> INSECT NUCLEAR RECEPTOR GENES AND USES THEREOF 

<130> 1392/3 

<150> US 60/278,336 

<151> 2001-03-23 

<160> 56 

<170> PatentIn version 3.0 

<210> 1 

<211> 1560 

<212> DNA 

<213> Drosophila melanogaster 

<400> 1 



atgctgagga acgaatacat aagcaataac aagaaggtgc tcaactcgga ccaaaacaag 60 

tactacatgc taacggtcga ggaggccgat atccaaggca cgaacatgtc cgacggcgtc 120 

agcatcttgc acatcaaaca ggaggtggac actccatcgg cgtcctgctt tagtcccagc 180 

tccaagtcaa cggccacgca gagtggcaca aacggcctga aatcctcgcc ctcggtttcg 240 

ccggaaaggc agctctgcag ctcgacgacc tctctatcct gcgatttgca caatgtatcc 300 

ttaagcaatg atggcgatag tctgaaagga agtggtacaa gtggcggcaa tggcggagga 360 

ggaggtggtg gtacgagtgg tggaaatgcg accaatgcga gtgccggagc tggatcggga 420 

tccgtcaggg acgagctccg ccgattgtgt ttggtttgtg gcgatgtggc cagtggattc 480 

cactatggtg tggcgagttg tgaggcttgc aaagcgttct ttaaacgcac catccaaggc 540 

aacatcgagt acacgtgtcc ggcgaacaac gagtgtgaga ttaacaagcg gagacgcaag 600 

gcctgccaag cgtgtcgctt ccagaaatgt ctactaatgg gcatgctcaa ggagggtgtg 660 

cgcttggatc gagttcgtgg aggacggcag aagtaccgaa ggaatcctgt atcaaactct 720 

taccagacta tgcagctgct ataccaatcc aacaccacct cgctgtgcga tgtcaagata 780 

ctggaggtgc tcaattcata tgagccggat gccttgagcg tccaaacgcc gccgccgcaa 840 

gtccacacga ctagcataac taatgatgag gcctcatcct cctcgggcag cataaaactg 900 

gagtccagcg ttgttacgcc caatgggact tgcattttcc aaaacaacaa caacaatgat 960 

cccaatgaga tactaagcgt ccttagtgat atttacgaca aggaattggt cagcgtcatt 1020 

ggctgggcca agcagatacc tggctttata gatctgccac ttaacgacca gatgaagctt 1080 

ctccaggtgt cgtgggcaga gatcctgacg ctccagctga ccttccggtc cctaccgttc 1140 

aatggcaagt tatgcttcgc cacggatgtc tggatggatg aacatttggc caaggagtgc 1200 

ggttacacgg agttctacta ccactgcgtc cagatcgcac agcgcatgga aagaatatcg 1260 

ccacgaaggg aggagtacta cttgctaaag gcgctcctgc tggccaactg cgacattctg 1320 

ctggatgatc agagttccct gcgcgcattt cgtgatacga ttcttaattc tctaaacgat 1380 

gtggtctact tgctgcgtca ttcgtcggcc gtgtcgcatc agcaacaatt gctgcttttg 1440 

ctgccttcgc tgcggcaggc ggatgatatc ctgcgaagat tttggcgtgg aattgcacgc 1500 

gatgaagtca ttaccatgaa gaaactgttc ctcgagatgc tcgagccgct ggccaggtga 1560 



<210> 2 
<211> 519 
<212> PRT 

<213> Drosophila melanogaster 
<400> 2 

Met Leu Arg Asn Glu Tyr Ile Ser Asn Asn Lys Lys Val Leu Asn Ser 
1               5                   10                  15 

Asp Gln Asn Lys Tyr Tyr Met Leu Thr Val Glu Glu Ala Asp Ile Gln 
            20                  25                  30 

Gly Thr Asn Met Ser Asp Gly Val Ser Ile Leu His Ile Lys Gln Glu 
        35                  40                  45 

Val Asp Thr Pro Ser Ala Ser Cys Phe Ser Pro Ser Ser Lys Ser Thr 
    50                  55                  60 

Ala Thr Gln Ser Gly Thr Asn Gly Leu Lys Ser Ser Pro Ser Val Ser 
65                  70                  75                  80 

Pro Glu Arg Gln Leu Cys Ser Ser Thr Thr Ser Leu Ser Cys Asp Leu 
                85                  90                  95 

His Asn Val Ser Leu Ser Asn Asp Gly Asp Ser Leu Lys Gly Ser Gly 
            100                 105                 110 

Thr Ser Gly Gly Asn Gly Gly Gly Gly Gly Gly Gly Thr Ser Gly Gly 
        115                 120                 125 

Asn Ala Thr Asn Ala Ser Ala Gly Ala Gly Ser Gly Ser Val Arg Asp 
    130                 135                 140 

Glu Leu Arg Arg Leu Cys Leu Val Cys Gly Asp Val Ala Ser Gly Phe 
145                 150                 155                 160 

His Tyr Gly Val Ala Ser Cys Glu Ala Cys Lys Ala Phe Phe Lys Arg 
                165                 170                 175 

Thr Ile Gln Gly Asn Ile Glu Tyr Thr Cys Pro Ala Asn Asn Glu Cys 
            180                 185                 190 

Glu Ile Asn Lys Arg Arg Arg Lys Ala Cys Gln Ala Cys Arg Phe Gln 
        195                 200                 205 

Lys Cys Leu Leu Met Gly Met Leu Lys Glu Gly Val Arg Leu Asp Arg 
    210                 215                 220 

Val Arg Gly Gly Arg Gln Lys Tyr Arg Arg Asn Pro Val Ser Asn Ser 
225                 230                 235                 240 

Tyr Gln Thr Met Gln Leu Leu Tyr Gln Ser Asn Thr Thr Ser Leu Cys 
                245                 250                 255 

Asp Val Lys Ile Leu Glu Val Leu Asn Ser Tyr Glu Pro Asp Ala Leu 
            260                 265                 270 

Ser Val Gln Thr Pro Pro Pro Gln Val His Thr Thr Ser Ile Thr Asn 
        275                 280                 285 

Asp Glu Ala Ser Ser Ser Ser Gly Ser Ile Lys Leu Glu Ser Ser Val 
    290                 295                 300 

Val Thr Pro Asn Gly Thr Cys Ile Phe Gln Asn Asn Asn Asn Asn Asp 
305                 310                 315                 320 

Pro Asn Glu Ile Leu Ser Val Leu Ser Asp Ile Tyr Asp Lys Glu Leu 
                325                 330                 335 

Val Ser Val Ile Gly Trp Ala Lys Gln Ile Pro Gly Phe Ile Asp Leu 
            340                 345                 350 

Pro Leu Asn Asp Gln Met Lys Leu Leu Gln Val Ser Trp Ala Glu Ile 
        355                 360                 365 

Leu Thr Leu Gln Leu Thr Phe Arg Ser Leu Pro Phe Asn Gly Lys Leu 
    370                 375                 380 

Cys Phe Ala Thr Asp Val Trp Met Asp Glu His Leu Ala Lys Glu Cys 
385                 390                 395                 400 

Gly Tyr Thr Glu Phe Tyr Tyr His Cys Val Gln Ile Ala Gln Arg Met 
                405                 410                 415 

Glu Arg Ile Ser Pro Arg Arg Glu Glu Tyr Tyr Leu Leu Lys Ala Leu 
            420                 425                 430 

Leu Leu Ala Asn Cys Asp Ile Leu Leu Asp Asp Gln Ser Ser Leu Arg 
        435                 440                 445 

Ala Phe Arg Asp Thr Ile Leu Asn Ser Leu Asn Asp Val Val Tyr Leu 
    450                 455                 460 

Leu Arg His Ser Ser Ala Val Ser His Gln Gln Gln Leu Leu Leu Leu 
465                 470                 475                 480 

Leu Pro Ser Leu Arg Gln Ala Asp Asp Ile Leu Arg Arg Phe Trp Arg 
                485                 490                 495 

Gly Ile Ala Arg Asp Glu Val Ile Thr Met Lys Lys Leu Phe Leu Glu 
            500                 505                 510 

Met Leu Glu Pro Leu Ala Arg 
        515 
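
By way of non-limiting illustration, the correspondence between the coding sequence of SEQ ID NO:1 and the polypeptide of SEQ ID NO:2 can be checked by conceptual translation. The following Python sketch assumes the standard genetic code; the function and variable names are hypothetical and are introduced for illustration only. Translating the first 60 nucleotides of SEQ ID NO:1 gives the first 20 residues of SEQ ID NO:2 in one-letter code, and a 1560-nucleotide coding sequence ending in a stop codon corresponds to 519 residues.

```python
# Illustrative sketch only; assumes the standard genetic code.

BASES = "tcag"
# Amino acids in the conventional t, c, a, g codon order; "*" marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.lower().replace(" ", "")
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO:1 (nucleotides 1-60).
seq1_start = ("atgctgagga acgaatacat aagcaataac "
              "aagaaggtgc tcaactcgga ccaaaacaag")
first_20_residues = translate(seq1_start)

# Expected residue count for a 1560-nucleotide sequence with one stop codon.
expected_length = (1560 - 3) // 3
```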

<210> 3 
<211> 1455 
<212> DNA 

<213> Drosophila melanogaster 

<400> 3 

atgtccgacg gcgtcagcat cttgcacatc aaacaggagg tggacactcc atcggcgtcc 60 

tgctttagtc ccagctccaa gtcaacggcc acgcagagtg gcacaaacgg cctgaaatcc 120 

tcgccctcgg tttcgccgga aaggcagctc tgcagctcga cgacctctct atcctgcgat 180 

ttgcacaatg tatccttaag caatgatggc gatagtctga aaggaagtgg tacaagtggc 240

ggcaatggcg gaggaggagg tggtggtacg agtggtggaa atgcgaccaa tgcgagtgcc 300

ggagctggat cgggatccgt cagggacgag ctccgccgat tgtgtttggt ttgtggcgat 360

gtggccagtg gattccacta tggtgtggcg agttgtgagg cttgcaaagc gttctttaaa 420 

cgcaccatcc aaggcaacat cgagtacacg tgtccggcga acaacgagtg tgagattaac 480 

aagcggagac gcaaggcctg ccaagcgtgt cgcttccaga aatgtctact aatgggcatg 540 

ctcaaggagg gtgtgcgctt ggatcgagtt cgtggaggac ggcagaagta ccgaaggaat 600 

cctgtatcaa actcttacca gactatgcag ctgctatacc aatccaacac cacctcgctg 660 

tgcgatgtca agatactgga ggtgctcaat tcatatgagc cggatgcctt gagcgtccaa 720 

acgccgccgc cgcaagtcca cacgactagc ataactaatg atgaggcctc atcctcctcg 780 

ggcagcataa aactggagtc cagcgttgtt acgcccaatg ggacttgcat tttccaaaac 840 

aacaacaaca atgatcccaa tgagatacta agcgtcctta gtgatattta cgacaaggaa 900 

ttggtcagcg tcattggctg ggccaagcag atacctggct ttatagatct gccacttaac 960 

gaccagatga agcttctcca ggtgtcgtgg gcagagatcc tgacgctcca gctgaccttc 1020 

cggtccctac cgttcaatgg caagttatgc ttcgccacgg atgtctggat ggatgaacat 1080 

ttggccaagg agtgcggtta cacggagttc tactaccact gcgtccagat cgcacagcgc 1140 

atggaaagaa tatcgccacg aagggaggag tactacttgc taaaggcgct cctgctggcc 1200

aactgcgaca ttctgctgga tgatcagagt tccctgcgcg catttcgtga tacgattctt 1260 

aattctctaa acgatgtggt ctacttgctg cgtcattcgt cggccgtgtc gcatcagcaa 1320 

caattgctgc ttttgctgcc ttcgctgcgg caggcggatg atatcctgcg aagattttgg 1380 

cgtggaattg cacgcgatga agtcattacc atgaagaaac tgttcctcga gatgctcgag 1440 

ccgctggcca ggtga 1455 
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<210> 4 

<211> 484 

<212> PRT 

<213> Drosophila melanogaster

<400> 4

Met Ser Asp Gly Val Ser Ile Leu His Ile Lys Gln Glu Val Asp Thr
1 5 10 15

Pro Ser Ala Ser Cys Phe Ser Pro Ser Ser Lys Ser Thr Ala Thr Gln
20 25 30

Ser Gly Thr Asn Gly Leu Lys Ser Ser Pro Ser Val Ser Pro Glu Arg
35 40 45

Gln Leu Cys Ser Ser Thr Thr Ser Leu Ser Cys Asp Leu His Asn Val
50 55 60

Ser Leu Ser Asn Asp Gly Asp Ser Leu Lys Gly Ser Gly Thr Ser Gly
65 70 75 80

Gly Asn Gly Gly Gly Gly Gly Gly Gly Thr Ser Gly Gly Asn Ala Thr
85 90 95

Asn Ala Ser Ala Gly Ala Gly Ser Gly Ser Val Arg Asp Glu Leu Arg
100 105 110

Arg Leu Cys Leu Val Cys Gly Asp Val Ala Ser Gly Phe His Tyr Gly
115 120 125

Val Ala Ser Cys Glu Ala Cys Lys Ala Phe Phe Lys Arg Thr Ile Gln
130 135 140

Gly Asn Ile Glu Tyr Thr Cys Pro Ala Asn Asn Glu Cys Glu Ile Asn
145 150 155 160

Lys Arg Arg Arg Lys Ala Cys Gln Ala Cys Arg Phe Gln Lys Cys Leu
165 170 175

Leu Met Gly Met Leu Lys Glu Gly Val Arg Leu Asp Arg Val Arg Gly
180 185 190

Gly Arg Gln Lys Tyr Arg Arg Asn Pro Val Ser Asn Ser Tyr Gln Thr
195 200 205

Met Gln Leu Leu Tyr Gln Ser Asn Thr Thr Ser Leu Cys Asp Val Lys
210 215 220

Ile Leu Glu Val Leu Asn Ser Tyr Glu Pro Asp Ala Leu Ser Val Gln
225 230 235 240

Thr Pro Pro Pro Gln Val His Thr Thr Ser Ile Thr Asn Asp Glu Ala
245 250 255

Ser Ser Ser Ser Gly Ser Ile Lys Leu Glu Ser Ser Val Val Thr Pro
260 265 270

Asn Gly Thr Cys Ile Phe Gln Asn Asn Asn Asn Asn Asp Pro Asn Glu
275 280 285

Ile Leu Ser Val Leu Ser Asp Ile Tyr Asp Lys Glu Leu Val Ser Val
290 295 300

Ile Gly Trp Ala Lys Gln Ile Pro Gly Phe Ile Asp Leu Pro Leu Asn
305 310 315 320

Asp Gln Met Lys Leu Leu Gln Val Ser Trp Ala Glu Ile Leu Thr Leu
325 330 335

Gln Leu Thr Phe Arg Ser Leu Pro Phe Asn Gly Lys Leu Cys Phe Ala
340 345 350

Thr Asp Val Trp Met Asp Glu His Leu Ala Lys Glu Cys Gly Tyr Thr
355 360 365

Glu Phe Tyr Tyr His Cys Val Gln Ile Ala Gln Arg Met Glu Arg Ile
370 375 380

Ser Pro Arg Arg Glu Glu Tyr Tyr Leu Leu Lys Ala Leu Leu Leu Ala
385 390 395 400

Asn Cys Asp Ile Leu Leu Asp Asp Gln Ser Ser Leu Arg Ala Phe Arg
405 410 415

Asp Thr Ile Leu Asn Ser Leu Asn Asp Val Val Tyr Leu Leu Arg His
420 425 430

Ser Ser Ala Val Ser His Gln Gln Gln Leu Leu Leu Leu Leu Pro Ser
435 440 445

Leu Arg Gln Ala Asp Asp Ile Leu Arg Arg Phe Trp Arg Gly Ile Ala
450 455 460

Arg Asp Glu Val Ile Thr Met Lys Lys Leu Phe Leu Glu Met Leu Glu
465 470 475 480

Pro Leu Ala Arg

<210> 5 
<211> 2106 
<212> DNA 

<213> Drosophila melanogaster 

<400> 5 

atgaccactg tcaaggcgga gaagccggta accaccactt gctcctactg ggtgttacgc 60 

aaaaagcgcc gctgcaagat gactgcaaac aagggcagtg agttttgtgg agcccatgcc 120 

tcatcggcag ccaccactgc aacttgcgat gaaaaagcta gagaagattc cttccaggaa 180 

cgtattccct gcccgcttga tcacaaacat accgtgttta aaagaaaact agccaaacac 240 

ttgaccattt gcaatgccag ggatcaggaa agttcactgc cctacattgt aaaaggagtg 300 

aattctggag acaatcttaa ggaaacagac gaagatttga acaaattcaa tcaaataaaa 360 

ctgcacgaac tggcggatga agagttctac agtctgatcg acaaagttaa gaatctgtat 420 

gacaagcaca tcaattccag catacaggaa ctgcagctgg agcacgaatc cctaaaggag 480

gaacttagtc gcaaggacta cggccaggaa acgctacgcc agctcaccca gaccactagc 540 
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ttgctgggca tcctggaaca cgatcaccag ttgatggatc acacgagcta tatagaattc 600 

ggagccggaa agggacaatt ggcctacttt ttggccaccg tattgcagga acagaagttg 660 

agtcactcac aggtggtcct catcgatcga atgtccctgc ggcacaagaa ggacaacaag 720 

ttggccaaca gggaggtggt gcaacgtatc cgtgctgata tcgctgatct taagctttct 780 

gctctgccgg agctaaagaa aacccaacga acggtggctt tttcgaagca cctctgtgga 840 

gcagctacag acttgaccct gagatgtata ctcggtgatg gaaatgccag ctcggactac 900 

gttctcatcg ccctgtgttg tcatcatcgc tgctcctggc gctcctacgt gggacgcaag 960 

tttcttcagg aagctggaat tggaccacgg gagttcgtta tcctaaccaa aatggtcagt 1020 

tgggcagttt gtggcacggg cttaagccga gagcggcgta aggccatgga atcggctgac 1080

tttcagctca ccgagacgaa cacgcagcgt ctaactcgcc aggagcgcga acaaattggc 1140 

caacagtgca agcgggttct ggattatgga cgactggagc atctacgatc ccacggatat 1200

caagcagaac ttaagttcta tgttcccagg gatgtgactt tggaaaacca catccaggtg 1260 

gcatataaaa ccgccaatat ccgggatgag ctgattcagt ccgcggcgac tctttggctc 1320 

attggcaagc agtgccaacc ggtgaccatg tcgaacttca gtgcctgcgc agtgtgcggc 1380

gatcagagct ccgggaagca ctacggcgtg tcctgctgcg atgggtgctc ctgctttttc 1440 

aagcggagcg tgcggcgcgg gagcagctac gcctgcatcg ctctggtcgg gaactgtgtg 1500 

gtggacaagg cgcggcggaa ctggtgtccc tcctgccgct tccagcgatg cctggccgtg 1560 

ggaatgaacg ctgctgcggt tcaggaggag cgcggtccgc gcaaccagca ggtggctctc 1620 

taccgcactg gccggagaca agctccgcca tctcaggcgg cgccatcccc gacgccccac 1680 

tcccaggcgc tgcacttcca gatcctcgcc cagatccttg tcacgtgcct gcgccaggcg 1740 

aaggccaacg agcagttcgc tctgttggat cgctgccaac aagacgccat ctttcaggtg 1800 

gtgtggagcg agatcttcgt cctgcgagcg tcccactggt ctctggacat cagcgccatg 1860 

atcgacggct gcggcgatga gcagctcaaa cggctcattt gcgaggccca ccagctaagg 1920 

gccgacgtcc tggaactcaa ctttatggag tccctaatcc tgtgcagaaa agaattggcc 1980 

atcaatgcgg agtatgccgt tatcctggga agccactcta aagccgccct gatctcctta 2040 

gcccgctaca ccctgcagca atccaactac ctgcgggtgg tcagggacat cttaaaaaca 2100 

ctttag 2106 

<210> 6 
<211> 701 
<212> PRT 

<213> Drosophila melanogaster 
<400> 6 



7/97 



WO 02/077157 



PCT/US02/11257 



Met Thr Thr Val Lys Ala Glu Lys Pro Val Thr Thr Thr Cys Ser Tyr
1 5 10 15

Trp Val Leu Arg Lys Lys Arg Arg Cys Lys Met Thr Ala Asn Lys Gly
20 25 30

Ser Glu Phe Cys Gly Ala His Ala Ser Ser Ala Ala Thr Thr Ala Thr
35 40 45

Cys Asp Glu Lys Ala Arg Glu Asp Ser Phe Gln Glu Arg Ile Pro Cys
50 55 60

Pro Leu Asp His Lys His Thr Val Phe Lys Arg Lys Leu Ala Lys His
65 70 75 80

Leu Thr Ile Cys Asn Ala Arg Asp Gln Glu Ser Ser Leu Pro Tyr Ile
85 90 95

Val Lys Gly Val Asn Ser Gly Asp Asn Leu Lys Glu Thr Asp Glu Asp
100 105 110

Leu Asn Lys Phe Asn Gln Ile Lys Leu His Glu Leu Ala Asp Glu Glu
115 120 125

Phe Tyr Ser Leu Ile Asp Lys Val Lys Asn Leu Tyr Asp Lys His Ile
130 135 140

Asn Ser Ser Ile Gln Glu Leu Gln Leu Glu His Glu Ser Leu Lys Glu
145 150 155 160

Glu Leu Ser Arg Lys Asp Tyr Gly Gln Glu Thr Leu Arg Gln Leu Thr
165 170 175

Gln Thr Thr Ser Leu Leu Gly Ile Leu Glu His Asp His Gln Leu Met
180 185 190

Asp His Thr Ser Tyr Ile Glu Phe Gly Ala Gly Lys Gly Gln Leu Ala
195 200 205

Tyr Phe Leu Ala Thr Val Leu Gln Glu Gln Lys Leu Ser His Ser Gln
210 215 220

Val Val Leu Ile Asp Arg Met Ser Leu Arg His Lys Lys Asp Asn Lys
225 230 235 240

Leu Ala Asn Arg Glu Val Val Gln Arg Ile Arg Ala Asp Ile Ala Asp
245 250 255

Leu Lys Leu Ser Ala Leu Pro Glu Leu Lys Lys Thr Gln Arg Thr Val
260 265 270

Ala Phe Ser Lys His Leu Cys Gly Ala Ala Thr Asp Leu Thr Leu Arg
275 280 285

Cys Ile Leu Gly Asp Gly Asn Ala Ser Ser Asp Tyr Val Leu Ile Ala
290 295 300

Leu Cys Cys His His Arg Cys Ser Trp Arg Ser Tyr Val Gly Arg Lys
305 310 315 320

Phe Leu Gln Glu Ala Gly Ile Gly Pro Arg Glu Phe Val Ile Leu Thr
325 330 335

Lys Met Val Ser Trp Ala Val Cys Gly Thr Gly Leu Ser Arg Glu Arg
340 345 350

Arg Lys Ala Met Glu Ser Ala Asp Phe Gln Leu Thr Glu Thr Asn Thr
355 360 365

Gln Arg Leu Thr Arg Gln Glu Arg Glu Gln Ile Gly Gln Gln Cys Lys
370 375 380

Arg Val Leu Asp Tyr Gly Arg Leu Glu His Leu Arg Ser His Gly Tyr
385 390 395 400

Gln Ala Glu Leu Lys Phe Tyr Val Pro Arg Asp Val Thr Leu Glu Asn
405 410 415

His Ile Gln Val Ala Tyr Lys Thr Ala Asn Ile Arg Asp Glu Leu Ile
420 425 430

Gln Ser Ala Ala Thr Leu Trp Leu Ile Gly Lys Gln Cys Gln Pro Val
435 440 445

Thr Met Ser Asn Phe Ser Ala Cys Ala Val Cys Gly Asp Gln Ser Ser
450 455 460

Gly Lys His Tyr Gly Val Ser Cys Cys Asp Gly Cys Ser Cys Phe Phe
465 470 475 480

Lys Arg Ser Val Arg Arg Gly Ser Ser Tyr Ala Cys Ile Ala Leu Val
485 490 495

Gly Asn Cys Val Val Asp Lys Ala Arg Arg Asn Trp Cys Pro Ser Cys
500 505 510

Arg Phe Gln Arg Cys Leu Ala Val Gly Met Asn Ala Ala Ala Val Gln
515 520 525

Glu Glu Arg Gly Pro Arg Asn Gln Gln Val Ala Leu Tyr Arg Thr Gly
530 535 540

Arg Arg Gln Ala Pro Pro Ser Gln Ala Ala Pro Ser Pro Thr Pro His
545 550 555 560

Ser Gln Ala Leu His Phe Gln Ile Leu Ala Gln Ile Leu Val Thr Cys
565 570 575

Leu Arg Gln Ala Lys Ala Asn Glu Gln Phe Ala Leu Leu Asp Arg Cys
580 585 590

Gln Gln Asp Ala Ile Phe Gln Val Val Trp Ser Glu Ile Phe Val Leu
595 600 605

Arg Ala Ser His Trp Ser Leu Asp Ile Ser Ala Met Ile Asp Gly Cys
610 615 620

Gly Asp Glu Gln Leu Lys Arg Leu Ile Cys Glu Ala His Gln Leu Arg
625 630 635 640

Ala Asp Val Leu Glu Leu Asn Phe Met Glu Ser Leu Ile Leu Cys Arg
645 650 655

Lys Glu Leu Ala Ile Asn Ala Glu Tyr Ala Val Ile Leu Gly Ser His
660 665 670

Ser Lys Ala Ala Leu Ile Ser Leu Ala Arg Tyr Thr Leu Gln Gln Ser
675 680 685

Asn Tyr Leu Arg Val Val Arg Asp Ile Leu Lys Thr Leu
690 695 700



<210> 7 

<211> 837 

<212> DNA 

<213> Drosophila melanogaster 

<400> 7 



atgtcgaact tcagtgcctg cgcagtgtgc ggcgatcaga gctccgggaa gcactacggc 60

gtgtcctgct gcgatgggtg ctcctgcttt ttcaagcgga gcgtgcggcg cgggagcagc 120

tacgcctgca tcgctctggt cgggaactgt gtggtggaca aggcgcggcg gaactggtgt 180

ccctcctgcc gcttccagcg atgcctggcc gtgggaatga acgctgctgc ggttcaggag 240

gagcgcggtc cgcgcaacca gcaggtggct ctctaccgca ctggccggag acaagctccg 300

ccatctcagg cggcgccatc cccgacgccc cactcccagg cgctgcactt ccagatcctc 360

gcccagatcc ttgtcacgtg cctgcgccag gcgaaggcca acgagcagtt cgctctgttg 420

gatcgctgcc aacaagacgc catctttcag gtggtgtgga gcgagatctt cgtcctgcga 480

gcgtcccact ggtctctgga catcagcgcc atgatcgacg gctgcggcga tgagcagctc 540

aaacggctca tttgcgaggc ccaccagcta agggccgacg tcctggaact caactttatg 600

gagtccctaa tcctgtgcag aaaagaattg gccatcaatg cggagtatgc cgttatcctg 660

ggaagccact ctaaagccgc cctgatctcc ttagcccgct acaccctgca gcaatccaac 720

tacctgcggt tcggacaact gctccttggt ctgaggcagc tgtgcctgag gcgcttcgac 780

tgcgcgcttt cttgtatgtt tcgcagcgtg gtcagggaca tcttaaaaac actttaa 837



<210> 8 
<211> 278 
<212> PRT 

<213> Drosophila melanogaster 
<400> 8 

Met Ser Asn Phe Ser Ala Cys Ala Val Cys Gly Asp Gln Ser Ser Gly
1 5 10 15

Lys His Tyr Gly Val Ser Cys Cys Asp Gly Cys Ser Cys Phe Phe Lys
20 25 30

Arg Ser Val Arg Arg Gly Ser Ser Tyr Ala Cys Ile Ala Leu Val Gly
35 40 45

Asn Cys Val Val Asp Lys Ala Arg Arg Asn Trp Cys Pro Ser Cys Arg
50 55 60

Phe Gln Arg Cys Leu Ala Val Gly Met Asn Ala Ala Ala Val Gln Glu
65 70 75 80

Glu Arg Gly Pro Arg Asn Gln Gln Val Ala Leu Tyr Arg Thr Gly Arg
85 90 95

Arg Gln Ala Pro Pro Ser Gln Ala Ala Pro Ser Pro Thr Pro His Ser
100 105 110

Gln Ala Leu His Phe Gln Ile Leu Ala Gln Ile Leu Val Thr Cys Leu
115 120 125

Arg Gln Ala Lys Ala Asn Glu Gln Phe Ala Leu Leu Asp Arg Cys Gln
130 135 140

Gln Asp Ala Ile Phe Gln Val Val Trp Ser Glu Ile Phe Val Leu Arg
145 150 155 160

Ala Ser His Trp Ser Leu Asp Ile Ser Ala Met Ile Asp Gly Cys Gly
165 170 175

Asp Glu Gln Leu Lys Arg Leu Ile Cys Glu Ala His Gln Leu Arg Ala
180 185 190

Asp Val Leu Glu Leu Asn Phe Met Glu Ser Leu Ile Leu Cys Arg Lys
195 200 205

Glu Leu Ala Ile Asn Ala Glu Tyr Ala Val Ile Leu Gly Ser His Ser
210 215 220

Lys Ala Ala Leu Ile Ser Leu Ala Arg Tyr Thr Leu Gln Gln Ser Asn
225 230 235 240

Tyr Leu Arg Phe Gly Gln Leu Leu Leu Gly Leu Arg Gln Leu Cys Leu
245 250 255

Arg Arg Phe Asp Cys Ala Leu Ser Cys Met Phe Arg Ser Val Val Arg
260 265 270

Asp Ile Leu Lys Thr Leu
275



<210> 9 
<211> 1626 
<212> DNA 

<213> Drosophila melanogaster 
<400> 9 

atgcacggac aggcgcctcc acctacatca acgggcgtgg ccccgcccac acagccaccg 60 
ccccctcatc ccgccgcccc aaacgtgccc aatggtcgat tgctgagctg gaatcacagt 120 
gccgctgcag ctgctgcggc ggcggcagcc caagcggcag ccaactccat gaaccactcg 180 
tcggcggcgg agggttcatc gatgacccgg attaagggtc agaacctggg cctcatctgc 240 
gtggtgtgcg gcgacaccag ctcgggaaag cactacggaa tcctagcctg caatggctgc 300
tccggattct tcaaacgcag cgcgggaacg ggacgctgtg tggtggacaa agctcatcgg 360

aatcaatgcc aggcctgcag gctcaagaag tgccttcaaa tgggaatgaa caaggacgcc 420 

gtgcaaaatg agcgacagcc ccgcaacacg gccactatac ggccggagac tctgcgggag 480

atggagcacg gacgggcgct gcgagaggcc gccgtggccg tcggggtttt cgggccaccc 540

gtgctactgt ctccgccctg ctacggctcc ggactcctgc cgccgccctc actgggcagt 600 

ttgccgacgg gacgattgct gcaccacaac cacctcacct cctcaatgca attggctgcg 660 

aaccacatgg gcgccggcag ctttcccatg ttcaacgcag ccggcgtgca ccactcgccc 720 

aaggaaaagg cctacggcat ggagatggcc accagtggca atgtctccca ctcgaccaac 780 

tcgagcagca accactccat cgacccgagc tcaccggcgc cggaaaacgc caaggagata 840

aacatcgccg gcggcagtgt ctcctccgtc agctcttcca gtcccaccat ggaaaatgac 900 

aatgatgacg actccataga tgtaaccaac gacaacgagg agccgcatgc agtcagcaga 960 

tcggattcga gtttcattat gccgcagttc atgtcgccca atctgtacac ccatcaacac 1020 

gaaacagttt acgagacaag tgcccggctg ctcttcatgg ccgtcaagtg ggccaagaac 1080 

ctgcccagct ttgcaagact ttcctttcgg gatcaggtga ttttgctgga ggagtcctgg 1140 

tcggagctgt tcctgctgaa cgcaatccaa tggtgcattc ccctggatcc caccggctgc 1200

gccctcttct cggtggcgga gcactgcaat aatctagaga acaatgccaa tggcgacact 1260 

tgcataacaa aggaggagct ggcggcggat gtgcgaacgc tccacgagat cttctgcaaa 1320

tacaaggcgg tgctggtgga ccccgctgaa ttcgcgtgcc tcaaggcgat agttctcttc 1380 

cggccggaaa cgcgcggact taaagatccg gcgcagatag agaatcttca ggatcaggcg 1440 

cacgtaatgc tgtcgcagca cacaaagacg cagttcaccg cccagatagc cagattcgga 1500 

cgactccttc tcatgctgcc gttgctgcgc atgatcagct cccacaagat tgagtccatc 1560 

tattttcagc gcactattgg gaacacgccc atggaaaagg tgctctgtga catgtataag 1620 

aactag 1626 

<210> 10 
<211> 541 
<212> PRT 

<213> Drosophila melanogaster 
<400> 10 

Met His Gly Gln Ala Pro Pro Pro Thr Ser Thr Gly Val Ala Pro Pro
1 5 10 15

Thr Gln Pro Pro Pro Pro His Pro Ala Ala Pro Asn Val Pro Asn Gly
20 25 30

Arg Leu Leu Ser Trp Asn His Ser Ala Ala Ala Ala Ala Ala Ala Ala
35 40 45

Ala Ala Gln Ala Ala Ala Asn Ser Met Asn His Ser Ser Ala Ala Glu
50 55 60

Gly Ser Ser Met Thr Arg Ile Lys Gly Gln Asn Leu Gly Leu Ile Cys
65 70 75 80

Val Val Cys Gly Asp Thr Ser Ser Gly Lys His Tyr Gly Ile Leu Ala
85 90 95

Cys Asn Gly Cys Ser Gly Phe Phe Lys Arg Ser Ala Gly Thr Gly Arg
100 105 110

Cys Val Val Asp Lys Ala His Arg Asn Gln Cys Gln Ala Cys Arg Leu
115 120 125

Lys Lys Cys Leu Gln Met Gly Met Asn Lys Asp Ala Val Gln Asn Glu
130 135 140

Arg Gln Pro Arg Asn Thr Ala Thr Ile Arg Pro Glu Thr Leu Arg Glu
145 150 155 160

Met Glu His Gly Arg Ala Leu Arg Glu Ala Ala Val Ala Val Gly Val
165 170 175

Phe Gly Pro Pro Val Leu Leu Ser Pro Pro Cys Tyr Gly Ser Gly Leu
180 185 190

Leu Pro Pro Pro Ser Leu Gly Ser Leu Pro Thr Gly Arg Leu Leu His
195 200 205

His Asn His Leu Thr Ser Ser Met Gln Leu Ala Ala Asn His Met Gly
210 215 220

Ala Gly Ser Phe Pro Met Phe Asn Ala Ala Gly Val His His Ser Pro
225 230 235 240

Lys Glu Lys Ala Tyr Gly Met Glu Met Ala Thr Ser Gly Asn Val Ser
245 250 255

His Ser Thr Asn Ser Ser Ser Asn His Ser Ile Asp Pro Ser Ser Pro
260 265 270

Ala Pro Glu Asn Ala Lys Glu Ile Asn Ile Ala Gly Gly Ser Val Ser
275 280 285

Ser Val Ser Ser Ser Ser Pro Thr Met Glu Asn Asp Asn Asp Asp Asp
290 295 300

Ser Ile Asp Val Thr Asn Asp Asn Glu Glu Pro His Ala Val Ser Arg
305 310 315 320

Ser Asp Ser Ser Phe Ile Met Pro Gln Phe Met Ser Pro Asn Leu Tyr
325 330 335

Thr His Gln His Glu Thr Val Tyr Glu Thr Ser Ala Arg Leu Leu Phe
340 345 350

Met Ala Val Lys Trp Ala Lys Asn Leu Pro Ser Phe Ala Arg Leu Ser
355 360 365

Phe Arg Asp Gln Val Ile Leu Leu Glu Glu Ser Trp Ser Glu Leu Phe
370 375 380

Leu Leu Asn Ala Ile Gln Trp Cys Ile Pro Leu Asp Pro Thr Gly Cys
385 390 395 400

Ala Leu Phe Ser Val Ala Glu His Cys Asn Asn Leu Glu Asn Asn Ala
405 410 415

Asn Gly Asp Thr Cys Ile Thr Lys Glu Glu Leu Ala Ala Asp Val Arg
420 425 430

Thr Leu His Glu Ile Phe Cys Lys Tyr Lys Ala Val Leu Val Asp Pro
435 440 445

Ala Glu Phe Ala Cys Leu Lys Ala Ile Val Leu Phe Arg Pro Glu Thr
450 455 460

Arg Gly Leu Lys Asp Pro Ala Gln Ile Glu Asn Leu Gln Asp Gln Ala
465 470 475 480

His Val Met Leu Ser Gln His Thr Lys Thr Gln Phe Thr Ala Gln Ile
485 490 495

Ala Arg Phe Gly Arg Leu Leu Leu Met Leu Pro Leu Leu Arg Met Ile
500 505 510

Ser Ser His Lys Ile Glu Ser Ile Tyr Phe Gln Arg Thr Ile Gly Asn
515 520 525

Thr Pro Met Glu Lys Val Leu Cys Asp Met Tyr Lys Asn
530 535 540



<210> 11 
<211> 1599 
<212> DNA 

<213> Drosophila melanogaster 

<400> 11 

atggcgaccg ggcgttctct gctctttcga gtgccttggt atgtgtgctt gtgtgtgtgc 60 

gcagagagcg cagagccggg tgtttattgg agattgcgat tgcggcttgg cttacccaca 120

ctcgcagggc cgcacaccaa cacactaaca ctaacagcga ggacaagctc ctgccgcagc 180 

atcaagaagg aacgaatcaa agcaagccaa caagcaaatg cgccaccaga gttgccacta 240

aaagtctccg ttgacgttaa catcatcatc gcggcacact cgcagcgccg tcggatcgga 300

ttggttcggt ttcatcagcg ggaatcagag gaccgtccac ttgccgtcgc ctctccacga 360

ttgcaaatta atatggagcc tactgcgatg aacccgaaaa aactccacag tccgcagcgg 420 

cattgctaca ctccgccgcc ggcgccgatg cacggacagg cgcctccacc tacatcaacg 480 

ggcgtggccc cgcccacaca gccaccgccc cctcatcccg ccgccccaaa cgtgcccaat 540 

ggtcgattgc tgagctggaa tcacagtgcc gctgcagctg ctgcggcggc ggcagcccaa 600 
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gcggcagcca actccatgaa ccactcgtcg gcggcggagg gttcatcgat gacccggatt 660 

aagggtcaga acctgggcct catctgcgtg gtgtgcggcg acaccagctc gggaaagcac 720 

tacggaatcc tagcctgcaa tggctgctcc ggattcttca aacgcagcgt gcggcggaaa 780 

ctcatttatc gctgccaggc gggaacggga cgctgtgtgg tggacaaagc tcatcggaat 840

caatgccagg cctgcaggct caagaagtgc cttcaaatgg gaatgaacaa ggacgacgac 900 

tccatagatg taaccaacga caacgaggag ccgcatgcag tcagcagatc ggattcgagt 960 

ttcattatgc cgcagttcat gtcgcccaat ctgtacaccc atcaacacga aacagtttac 1020 

gagacaagtg cccggctgct cttcatggcc gtcaagtggg ccaagaacct gcccagcttt 1080 

gcaagacttt cctttcggga tcaggtaatt ttgctggagg agtcctggtc ggagctgttc 1140 

ctgctgaacg caatccaatg gtgcattccc ctggatccca ccggctgcgc cctcttctcg 1200 

gtggcggagc actgcaataa tctagagaac aatgccaatg gcgacacttg cataacaaag 1260 

gaggagctgg cggcggatgt gcgaacgctc cacgagatct tctgcaaata caaggcggtg 1320

ctggtggacc ccgctgaatt cgcgtgcctc aaggcgatag ttctcttccg gccggaaacg 1380 

cgcggactta aagatccggc gcagatagag aatcttcagg atcaggcgca ccacacaaag 1440 

acgcagttca ccgcccagat agccagattc ggacgactcc ttctcatgct gccgttgctg 1500 

cgcatgatca gctcccacaa gattgagtcc atctattttc agcgcactat tgggaacacg 1560 

cccatggaaa aggtgctctg tgacatgtat aagaactag 1599 

<210> 12 
<211> 532 
<212> PRT 

<213> Drosophila melanogaster 
<400> 12 

Met Ala Thr Gly Arg Ser Leu Leu Phe Arg Val Pro Trp Tyr Val Cys
1 5 10 15

Leu Cys Val Cys Ala Glu Ser Ala Glu Pro Gly Val Tyr Trp Arg Leu
20 25 30

Arg Leu Arg Leu Gly Leu Pro Thr Leu Ala Gly Pro His Thr Asn Thr
35 40 45

Leu Thr Leu Thr Ala Arg Thr Ser Ser Cys Arg Ser Ile Lys Lys Glu
50 55 60

Arg Ile Lys Ala Ser Gln Gln Ala Asn Ala Pro Pro Glu Leu Pro Leu
65 70 75 80

Lys Val Ser Val Asp Val Asn Ile Ile Ile Ala Ala His Ser Gln Arg
85 90 95

Arg Arg Ile Gly Leu Val Arg Phe His Gln Arg Glu Ser Glu Asp Arg
100 105 110

Pro Leu Ala Val Ala Ser Pro Arg Leu Gln Ile Asn Met Glu Pro Thr
115 120 125

Ala Met Asn Pro Lys Lys Leu His Ser Pro Gln Arg His Cys Tyr Thr
130 135 140

Pro Pro Pro Ala Pro Met His Gly Gln Ala Pro Pro Pro Thr Ser Thr
145 150 155 160

Gly Val Ala Pro Pro Thr Gln Pro Pro Pro Pro His Pro Ala Ala Pro
165 170 175

Asn Val Pro Asn Gly Arg Leu Leu Ser Trp Asn His Ser Ala Ala Ala
180 185 190

Ala Ala Ala Ala Ala Ala Ala Gln Ala Ala Ala Asn Ser Met Asn His
195 200 205

Ser Ser Ala Ala Glu Gly Ser Ser Met Thr Arg Ile Lys Gly Gln Asn
210 215 220

Leu Gly Leu Ile Cys Val Val Cys Gly Asp Thr Ser Ser Gly Lys His
225 230 235 240

Tyr Gly Ile Leu Ala Cys Asn Gly Cys Ser Gly Phe Phe Lys Arg Ser
245 250 255

Val Arg Arg Lys Leu Ile Tyr Arg Cys Gln Ala Gly Thr Gly Arg Cys
260 265 270

Val Val Asp Lys Ala His Arg Asn Gln Cys Gln Ala Cys Arg Leu Lys
275 280 285

Lys Cys Leu Gln Met Gly Met Asn Lys Asp Asp Asp Ser Ile Asp Val
290 295 300

Thr Asn Asp Asn Glu Glu Pro His Ala Val Ser Arg Ser Asp Ser Ser
305 310 315 320

Phe Ile Met Pro Gln Phe Met Ser Pro Asn Leu Tyr Thr His Gln His
325 330 335

Glu Thr Val Tyr Glu Thr Ser Ala Arg Leu Leu Phe Met Ala Val Lys
340 345 350

Trp Ala Lys Asn Leu Pro Ser Phe Ala Arg Leu Ser Phe Arg Asp Gln
355 360 365

Val Ile Leu Leu Glu Glu Ser Trp Ser Glu Leu Phe Leu Leu Asn Ala
370 375 380

Ile Gln Trp Cys Ile Pro Leu Asp Pro Thr Gly Cys Ala Leu Phe Ser
385 390 395 400

Val Ala Glu His Cys Asn Asn Leu Glu Asn Asn Ala Asn Gly Asp Thr
405 410 415

Cys Ile Thr Lys Glu Glu Leu Ala Ala Asp Val Arg Thr Leu His Glu
420 425 430

Ile Phe Cys Lys Tyr Lys Ala Val Leu Val Asp Pro Ala Glu Phe Ala
435 440 445

Cys Leu Lys Ala Ile Val Leu Phe Arg Pro Glu Thr Arg Gly Leu Lys
450 455 460

Asp Pro Ala Gln Ile Glu Asn Leu Gln Asp Gln Ala His His Thr Lys
465 470 475 480

Thr Gln Phe Thr Ala Gln Ile Ala Arg Phe Gly Arg Leu Leu Leu Met
485 490 495

Leu Pro Leu Leu Arg Met Ile Ser Ser His Lys Ile Glu Ser Ile Tyr
500 505 510

Phe Gln Arg Thr Ile Gly Asn Thr Pro Met Glu Lys Val Leu Cys Asp
515 520 525

Met Tyr Lys Asn
530

<210> 13 
<211> 4563 
<212> DNA 

<213> Drosophila melanogaster 
<400> 13 

atgacactga gccgtggccc gtacagcgag ctcgataaaa tgagcctttt tcaagacctc 60 

aaactcaaac ggcgcaaaat cgattcgcga tgcagcagtg acggcgagtc catagcggac 120 

acgtccacct cgtcgccgga cctgctggcg cccatgtcgc cgaagctctg cgacagcggc 180 

tcggcggggg cgtcgctggg ggcatcgctg cccctgccgc tggccctgcc cctgccaatg 240 

gccctgccac tgcccatgtc gctgcccctg cccctcacgg cggcatcttc ggcggtcacc 300 

gtttcgctgg cagcggtcgt ggccgcggtg gccgagacgg gtggcgcggg cgcgggagga 360 

gctgggacag cagtaacagc gtcgggagca ggaccatgcg tctccacgtc gtctacgacg 420 

gcagcggcag ccacatcctc gacctcctcg ctctcgtcct cctcctcttc gtcatcctcc 480 

acgtcctcca gcacttcctc cgcctcgccg acagctggag cctcctccac ggccacctgc 540 

cccgccagca gcagcagcag cagtggaaac ggaagtgggg gcaaaagtgg tagcatcaag 600

caggagcaca cggagataca ctcgtcgagc agtgcgattt cggcggccgc cgcctcaacg 660 

gtgatgtcac cgccgcccgc tgaggcgacg agatccagtc cagccacgcc cgagggaggc 720 

ggaccagctg gcgacggaag tggagcaacg ggaggcggaa acacgagcgg cggatcaacg 780

gctggagtgg ccattaatga acaccaaaac aatggcaatg gcagcggcgg gagcagtcga 840 

gcctctcccg attcgctgga agagaagccc tctaccacaa cgaccacagg tcgtccaacg 900 

ctcacgccca cgaatggggt gctgtcctcc gcctcggcgg gcacggggat ttccacagga 960 
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agcagcgcca agctgagcga ggctggtatg agtgtgatac ggtccgtgaa ggaggagcgc 102 0 

ttgctcaacg tatccagcaa gatgctggtg ttccatcagc agcgggagca agagaccaaa 1080 

gcagtggcgg ctgcagcagc agcagcagcg gcgggccatg tgacggttct agtgacgcca 1140 

tcgcgcatca aatcggagcc accgccgccg gcttcaccct cctctacatc cagcacacaa 1200 

agggaaaggg aacgggaacg cgatcgagag agggatcgcg aaagggaacg cgagcgggac 1260

cgggaccggg aacgggaacg ggaacagtcc atcagctcct cgcagcagca cctaagtcgg 1320 

gtctccgcca gtccacccac tcagctgtcc cacggcagcc tgggacccaa cattgtgcag 1380 

acgcaccatc ttcaccagca actcacacag ccgctgacgc tgcgcaagag cagcccgccc 1440 

acagagcacc tgctcagtca gtccatgcaa catctcacac agcagcaggc gatccacctg 1500 

catcacctac ttggccagca gcagcagcag cagcaggcgt cgcatcccca gcagcaacag 1560 

cagcagcaac actcgcccca ctccctggtg cgggtgaaaa aggaaccgaa tgttggtcag 1620 

cggcacttat cgccgcatca ccaacaacag tcgccactcc tgcagcacca ccaacagcag 1680 

cagcagcagc aacaacaaca gcaacagcat ctgcatcagc aacagcaaca gcagcagcat 1740 

caccagcagc agccccaggc actggccctg atgcatccgg cttccctggc gctaaggaac 1800 

agcaatcggg atgcggccat tctgtttcgg gtgaagagcg aagtgcacca gcaggtggcc 1860 

gccgggctgc cgcatctgat gcagtccgct ggtggggcag cggccgccgc cgcagcagct 1920 

gtggccgctc agcgaatggt atgcttcagc aatgccagga tcaatggcgt taagccggag 1980 

gtgattggag gaccgctggg caacctgcgg cccgtgggcg tcggtggcgg aaacggaagt 2040 

ggctccgtgc agtgcccctc gccgcatcca tcctcctcgt cgtcatcctc gcagctgtcg 2100 

ccgcagacgc cctcccagac gccgccccga ggcacgccca ccgtcataat gggcgagagc 2160 

tgcggggtgc gcaccatggt ctggggctac gagcctccgc caccctcggc gggccagtcc 2220 

cacggccagc acccgcaaca gcaacagcag tcgccccacc accagccgca acaacaacag 2280 

cagcagcaac aacagcagtc gcagcagcaa cagcaacagc agcagcaaca gtcgctgggc 2340

cagcagcagc actgcctctc ctcgccgtcg gcgggatcgc tgacgccctc ctcttcgtcc 2400

ggcggtggtt cggtatctgg cggcggagtg ggcggaccac tcacaccctc ctcggtggcg 2460

ccgcagaata acgaggaggc cgcccaactc ctgcttctcc ctgggacaga cacgcatcca 2520 

ggacatgaga tcacgggcca cacccccttc cgcacaccgc acgcccttaa tatggagcgg 2580 

ctgtgggcgg gagactactc gcaattgccg cccggccagc tgcaggctct gaatctcagt 2640 

gcccaacagc agcagtgggg cagcagcaac tccacgggtc ttggtggcgt aggcggcggc 2700 

atgggcggac gcaacctgga ggcgccgcac gagccgaccg acgaggacga acagccgctc 2760

gtttgcatga tctgcgagga caaggccacc ggcctgcact acggcatcat cacctgcgag 2820

gggtgcaagg gctccttcaa gcggacggtg cagaaccgac gagtctacac ctgcgtggcg 2880

gacggcacct gcgagataac caaagcacag cgcaaccgtt gtcagtattg tcgatttaag 2940

aagtgcatcg agcagggcat ggtgctgcaa gccgttcgcg aggatcgcat gccgggcggt 3000

cgcaacagtg gcgccgtcta caatttgtac aaggtgaagt acaagaagca caagaagacc 3060

aatcagaagc agcagcagca ggccgcccag cagcagcagc agcaggcggc ggcgcagcag 3120

cagcaccagc aacagcagca gcatcaacag caccagcaac atcagcaaca gcagttgcac 3180

tcgccgctcc accatcacca ccaccagggc caccagtcgc accacgcgca gcagcagcac 3240

cacccacagc tgtcgccgca ccacctgctg tcgccgcagc agcagcaact tgccgccgcg 3300

gtggcagcag ctgcgcagca ccaacagcaa cagcaacaac agcagcaaca gcagcagcag 3360

gccaagctga tgggcggcgt ggtggacatg aagcccatgt tcctcggccc cgctttgaag 3420

ccggagttgc tgcaagcacc ccccatgcac agtccggccc agcaacaaca acagcagcag 3480

cagcagcagc agcaacagca ggcctcgccg catctctcgc ttagctcacc gcaccagcag 3540

cagcagcagc agcagggaca gcaccaaaac caccaccagc aacaaggtgg gggtggcgga 3600

ggagctggtg gaggagctca actgccgccg cacctggtga acggaacgat actgaagacg 3660

gccctaacca atcccagcga gattgtacat ctgcgccacc gcctcgactc ggcggtcagt 3720

tcgtccaagg accgacagat ctcgtacgag cacgccttag gcatgatcca gacactgatc 3780

gactgcgacg cgatggagga catagccaca ctgccgcact tcagcgagtt ccttgaggac 3840

aagtcggaga ttagcgagaa actgtgcaac atcggcgatt ccatagtcca caagctggtg 3900

tcgtggacaa aaaagttgcc cttctacctg gagatcccgg tggagataca taccaaacta 3960

ctgacggaca agtggcacga gatccttatc ctgaccacgg ccgcctacca ggcgttgcat 4020

ggcaagcggc gtggcgaggg aggaggcagc aggcatggtt cgccggcgtc aacgccactg 4080

agcacgccca ctggtacgcc gttgagcaca ccgataccct cgcccgccca gccactgcac 4140

aaggacgacc cggagtttgt cagcgaggtg aactcgcacc tgagcacact gcaaacctgc 4200

ttgaccacgc taatgggcca gccgatagcg atggagcagc tgaagctgga cgtcgggcac 4260

atggtggaca agatgaccca gatcaccatc atgttccggc gaatcaagct caagatggag 4320

gagtacgtct gcctgaaggt ttacatactg ctaaacaaag cagaagtgga actggagagc 4380

atccaggagc ggtacgtcca ggtgctgcgc tcctacctgc aaaactcctc gccgcagaat 4440

ccgcaggcga ggctcagtga actgctctcc cacataccag agatccaggc tgcggctagc 4500

ctgctgctcg agagcaagat gttctatgtg cccttcgtgc tcaactcggc gagcataagg 4560

tag 4563
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<210> 14 
<211> 1520 
<212> PRT 

<213> Drosophila melanogaster 
<400> 14 

Met Thr Leu Ser Arg Gly Pro Tyr Ser Glu Leu Asp Lys Met Ser Leu
1 5 10 15

Phe Gln Asp Leu Lys Leu Lys Arg Arg Lys Ile Asp Ser Arg Cys Ser
20 25 30

Ser Asp Gly Glu Ser Ile Ala Asp Thr Ser Thr Ser Ser Pro Asp Leu
35 40 45

Leu Ala Pro Met Ser Pro Lys Leu Cys Asp Ser Gly Ser Ala Gly Ala
50 55 60

Ser Leu Gly Ala Ser Leu Pro Leu Pro Leu Ala Leu Pro Leu Pro Met
65 70 75 80

Ala Leu Pro Leu Pro Met Ser Leu Pro Leu Pro Leu Thr Ala Ala Ser
85 90 95

Ser Ala Val Thr Val Ser Leu Ala Ala Val Val Ala Ala Val Ala Glu
100 105 110

Thr Gly Gly Ala Gly Ala Gly Gly Ala Gly Thr Ala Val Thr Ala Ser
115 120 125

Gly Ala Gly Pro Cys Val Ser Thr Ser Ser Thr Thr Ala Ala Ala Ala
130 135 140

Thr Ser Ser Thr Ser Ser Leu Ser Ser Ser Ser Ser Ser Ser Ser Ser
145 150 155 160

Thr Ser Ser Ser Thr Ser Ser Ala Ser Pro Thr Ala Gly Ala Ser Ser
165 170 175

Thr Ala Thr Cys Pro Ala Ser Ser Ser Ser Ser Ser Gly Asn Gly Ser
180 185 190

Gly Gly Lys Ser Gly Ser Ile Lys Gln Glu His Thr Glu Ile His Ser
195 200 205

Ser Ser Ser Ala Ile Ser Ala Ala Ala Ala Ser Thr Val Met Ser Pro
210 215 220

Pro Pro Ala Glu Ala Thr Arg Ser Ser Pro Ala Thr Pro Glu Gly Gly
225 230 235 240

Gly Pro Ala Gly Asp Gly Ser Gly Ala Thr Gly Gly Gly Asn Thr Ser
245 250 255

Gly Gly Ser Thr Ala Gly Val Ala Ile Asn Glu His Gln Asn Asn Gly
260 265 270

Asn Gly Ser Gly Gly Ser Ser Arg Ala Ser Pro Asp Ser Leu Glu Glu
275 280 285

Lys Pro Ser Thr Thr Thr Thr Thr Gly Arg Pro Thr Leu Thr Pro Thr
290 295 300

Asn Gly Val Leu Ser Ser Ala Ser Ala Gly Thr Gly Ile Ser Thr Gly
305 310 315 320

Ser Ser Ala Lys Leu Ser Glu Ala Gly Met Ser Val Ile Arg Ser Val
325 330 335

Lys Glu Glu Arg Leu Leu Asn Val Ser Ser Lys Met Leu Val Phe His
340 345 350

Gln Gln Arg Glu Gln Glu Thr Lys Ala Val Ala Ala Ala Ala Ala Ala
355 360 365

Ala Ala Ala Gly His Val Thr Val Leu Val Thr Pro Ser Arg Ile Lys
370 375 380

Ser Glu Pro Pro Pro Pro Ala Ser Pro Ser Ser Thr Ser Ser Thr Gln
385 390 395 400

Arg Glu Arg Glu Arg Glu Arg Asp Arg Glu Arg Asp Arg Glu Arg Glu
405 410 415

Arg Glu Arg Asp Arg Asp Arg Glu Arg Glu Arg Glu Gln Ser Ile Ser
420 425 430

Ser Ser Gln Gln His Leu Ser Arg Val Ser Ala Ser Pro Pro Thr Gln
435 440 445

Leu Ser His Gly Ser Leu Gly Pro Asn Ile Val Gln Thr His His Leu
450 455 460

His Gln Gln Leu Thr Gln Pro Leu Thr Leu Arg Lys Ser Ser Pro Pro
465 470 475 480

Thr Glu His Leu Leu Ser Gln Ser Met Gln His Leu Thr Gln Gln Gln
485 490 495

Ala Ile His Leu His His Leu Leu Gly Gln Gln Gln Gln Gln Gln Gln
500 505 510

Ala Ser His Pro Gln Gln Gln Gln Gln Gln Gln His Ser Pro His Ser
515 520 525

Leu Val Arg Val Lys Lys Glu Pro Asn Val Gly Gln Arg His Leu Ser
530 535 540

Pro His His Gln Gln Gln Ser Pro Leu Leu Gln His His Gln Gln Gln
545 550 555 560

Gln Gln Gln Gln Gln Gln Gln Gln Gln His Leu His Gln Gln Gln Gln
565 570 575

Gln Gln Gln His His Gln Gln Gln Pro Gln Ala Leu Ala Leu Met His
580 585 590

Pro Ala Ser Leu Ala Leu Arg Asn Ser Asn Arg Asp Ala Ala Ile Leu
595 600 605
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Phe Arg Val 
610 

His Leu Met 
5 625 

Val Ala Ala 



10 Val Lys Pro 



Gly Val Gly 
675 

15 

His Pro Ser 
690 

Ser Gin Thr 
20 705 

Cys Gly Val 



25 Ala Gly Gin 



His His Gin 
755 

30 

Gin Gin Gin 
770 

Cys Leu Ser 
35 785 

Gly Gly Gly 



40 Ser Ser Val 



Leu Pro Gly 
835 

45 

Pro Phe Arg 
850 

Asp Tyr Ser 
50 865 

Ala Gin Gin 



55 Val Gly Gly 



Thr Asp Glu 
915 

60 



Lys Ser Glu Val 

615 

Gin Ser Ala Gly 
630 

Gin Arg Met Val 
645 

Glu Val He Gly 
660 

Gly Gly Asn Gly 



Ser Ser Ser Ser 

695 

Pro Pro Arg Gly 
710 

Arg Thr Met Val 
725 

Ser His Gly Gin 
740 

Pro Gin Gin Gin 



Gin Gin Gin Gin 

775 

Ser Pro Ser Ala 
790 

Ser Val Ser Gly 
805 

Ala Pro Gin Asn 
820 

Thr Asp Thr His 



Thr Pro His Ala 

855 

Gin Leu Pro Pro 
870 

Gin Gin Trp Gly 
885 

Gly Met Gly Gly 
900 

Asp Glu Gin Pro 



His Gin Gin Val Ala 

620 

Gly Ala Ala Ala Ala 

635 

Cys Phe Ser Asn Ala 
650 

Gly Pro Leu Gly Asn 
665 

Ser Gly Ser Val Gin 
680 

Ser Ser Gin Leu Ser 

700 

Thr Pro Thr Val He 

715 

Trp Gly Tyr Glu Pro 
730 

His Pro Gin Gin Gin 
745 

Gin Gin Gin Gin Gin 
760 

Gin Gin Ser Leu Gly 

780 

Gly Ser Leu Thr Pro 

795 

Gly Gly Val Gly Gly 
810 

Asn Glu Glu Ala Ala 
825 

Pro Gly His Glu He 
840 

Leu Asn Met Glu Arg 

860 

Gly Gin Leu Gin Ala 

875 

Ser Ser Asn Ser Thr 
890 

Arg Asn Leu Glu Ala 
905 

Leu Val Cys Met He 
920 



Ala Gly Leu Pro 



Ala Ala Ala Ala 

640 

Arg He Asn Gly 
655 

Leu Arg Pro Val 
670 

Cys Pro Ser Pro 
685 

Pro Gin Thr Pro 



Met Gly Glu Ser 

720 

Pro Pro Pro Ser 
735 

Gin Gin Ser Pro 
750 

Gin Gin Ser Gin 
765 

Gin Gin Gin His 



Ser Ser Ser Ser 

800 

Pro Leu Thr Pro 
815 

Gin Leu Leu Leu 
830 

Thr Gly His Thr 
845 

Leu Trp Ala Gly 



Leu Asn Leu Ser 

880 

Gly Leu Gly Gly 
895 

Pro His Glu Pro 
910 

Cys Glu Asp Lys 
925 
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Ala Thr Gly Leu His Tyr Gly lie lie Thr Cys Glu Gly Cys Lys Gly 
930 935 940 

Phe Phe Lys Arg Thr Val Gin Asn Arg Arg Val Tyr Thr Cys Val Ala 
945 950 955 960 

Asp Gly Thr Cys Glu lie Thr Lys Ala Gin Arg Asn Arg Cys Gin Tyr 

965 970 975 

Cys Arg Phe Lys Lys Cys lie Glu Gin Gly Met Val Leu Gin Ala Val 

980 985 990 

Arg Glu Asp Arg Met Pro Gly Gly Arg Asn Ser Gly Ala Val Tyr Asn 
995 1000 1005 

Leu Tyr Lys Val Lys Tyr Lys Lys His Lys Lys Thr Asn Gin Lys 
1010 1015 1020 

Gin Gin Gin Gin Ala Ala Gin Gin Gin Gin Gin Gin Ala Ala Ala 
1025 1030 1035 

Gin Gin Gin His Gin Gin Gin Gin Gin His Gin Gin His Gin Gin 
1040 1045 1050 

His Gin Gin Gin Gin Leu His Ser Pro Leu His His His His His 
1055 1060 1065 

Gin Gly His Gin Ser His His Ala Gin Gin Gin His His Pro Gin 
1070 1075 1080 

Leu Ser Pro His His Leu Leu Ser Pro Gin Gin Gin Gin Leu Ala 
1085 1090 1095 

Ala Ala Val Ala Ala Ala Ala Gin His Gin Gin Gin Gin Gin Gin 
1100 1105 1110 

Gin Gin Gin Gin Gin Gin Gin Ala Lys Leu Met Gly Gly Val Val 
1115 1120 1125 

Asp Met Lys Pro Met Phe Leu Gly Pro Ala Leu Lys Pro Glu Leu 
1130 1135 1140 

Leu Gin Ala Pro Pro Met His Ser Pro Ala Gin Gin Gin Gin Gin 
1145 1150 1155 

Gin Gin Gin Gin Gin Gin Gin Gin Gin Ala Ser Pro His Leu Ser 
1160 1165 1170 

Leu Ser Ser Pro His Gin Gin Gin Gin Gin Gin Gin Gly Gin His 
1175 1180 1185 

Gin Asn His His Gin Gin Gin Gly Gly Gly Gly Gly Gly Ala Gly 
1190 1195 1200 

Gly Gly Ala Gin Leu Pro Pro His Leu Val Asn Gly Thr lie Leu 
1205 1210 1215 

Lys Thr Ala Leu Thr Asn Pro Ser Glu lie Val His Leu Arg His 
1220 1225 1230 
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Arg Leu Asp 
1235 

Tyr Glu His 
1250 

Ala Met Glu 
1265 

Glu Asp Lys 
1280 

Ser lie Val 
1295 

Tyr Leu Glu 
1310 

Lys Trp His 
1325 

Leu His Gly 
1340 

Ser Pro Ala 
1355 

Ser Thr Pro 
1370 

Pro Glu Phe 
1385 

Thr Cys Leu 
1400 

Leu Lys Leu 
1415 

Thr He Met 
1430 

Cys Leu Lys 
1445 

Glu Ser He 
1460 

Gin Asn Ser 
1475 

Leu Ser His 

1490 

Glu Ser Lys 
1505 

He Arg 
1520 



Ser Ala Val 
Ala Leu Gly 
Asp He Ala 
Ser Glu lie 
His Lys Leu 
He Pro Val 
Glu He Leu 
Lys Arg Arg 
Ser Thr Pro 
He Pro Ser 
Val Ser Glu 
Thr Thr Leu 
Asp Val Gly 
Phe Arg Arg 
Val Tyr He 
Gin Glu Arg 
Ser Pro Gin 
He Pro Glu 
Met Phe Tyr 



Ser Ser Ser 
1240 

Met He Gin 
1255 

Thr Leu Pro 
1270 

Ser Glu Lys 
1285 

Val Ser Trp 
1300 

4 

Glu He His 
1315 

He Leu Thr 
1330 

Gly Glu Gly 
1345 

Leu Ser Thr 
1360 

Pro Ala Gin 
1375 

Val Asn Ser 
1390 

Met Gly Gin 
1405 

His Met Val 
1420 

He Lys Leu 
1435 

Leu Leu Asn 
1450 

Tyr Val Gin 
1465 

Asn Pro Gin 
1480 

He Gin Ala 
1495 

Val Pro Phe 
1510 



Lys Asp Arg 
1245 

Thr Leu He 
1260 

His Phe Ser 
1275 

Leu Cys Asn 
1290 

Thr Lys Lys 
1305 

Thr Lys Leu 
1320 

Thr Ala Ala 
1335 

Gly Gly Ser 
1350 

Pro Thr Gly 
1365 

Pro Leu His 
1380 

His Leu Ser 
1395 

Pro He Ala 
1410 

Asp Lys Met 
1425 

Lys Met Glu 
1440 

Lys Ala Glu 
1455 

Val Leu Arg 
1470 

Ala Arg Leu 
14 8 5 

Ala Ala Ser 
1500 

Val Leu Asn 
1515 



Gin He Ser 
Asp Cys Asp 
Glu Phe Leu 
He Gly Asp 
Leu Pro Phe 
Leu Thr Asp 
Tyr Gin Ala 
Arg His Gly 
Thr Pro Leu 
Lys Asp Asp 
Thr Leu Gin 
Met Glu Gin 
Thr Gin He 
Glu Tyr Val 
Val Glu Leu 
Ser Tyr Leu 
Ser Glu Leu 
Leu Leu Leu 
Ser Ala Ser 
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<210> 15 
<211> 693 
<212> PRT 

<213> Drosophila melanogaster 
<400> 15 

Met Gly Thr Ala Gly Asp Arg Leu Leu Asp Ile Pro Cys Lys Val Cys
1 5 10 15

Gly Asp Arg Ser Ser Gly Lys His Tyr Gly Ile Tyr Ser Cys Asp Gly
20 25 30

Cys Ser Gly Phe Phe Lys Arg Ser Ile His Arg Asn Arg Ile Tyr Thr
35 40 45

Cys Lys Ala Thr Gly Asp Leu Lys Gly Arg Cys Pro Val Asp Lys Thr
50 55 60

His Arg Asn Gln Cys Arg Ala Cys Arg Leu Ala Lys Cys Phe Gln Ser
65 70 75 80

Ala Met Asn Lys Asp Ala Val Gln His Glu Arg Gly Pro Arg Lys Pro
85 90 95

Lys Leu His Pro Gln Leu His His His His His His Ala Ala Ala Ala
100 105 110

Ala Ala Ala Ala His His Ala Ala Ala Ala His His His His His His
115 120 125

His His His Ala His Ala Ala Ala Ala His His Ala Ala Val Ala Ala
130 135 140

Ala Ala Ala Ser Gly Leu His His His His His Ala Met Pro Val Ser
145 150 155 160

Leu Val Thr Asn Val Ser Ala Ser Phe Asn Tyr Thr Gln His Ile Ser
165 170 175

Thr His Pro Pro Ala Pro Ala Ala Pro Pro Ser Gly Phe His Leu Thr
180 185 190

Ala Ser Gly Ala Gln Gln Gly Pro Ala Pro Pro Ala Gly His Leu His
195 200 205

His Gly Gly Ala Gly His Gln His Ala Thr Ala Phe His His Pro Gly
210 215 220

His Gly His Ala Leu Pro Ala Pro His Gly Gly Val Ile Ser Asn Pro
225 230 235 240

Gly Gly Asn Ser Ser Ala Ile Ser Gly Ser Gly Pro Gly Ser Thr Leu
245 250 255

Pro Phe Pro Ser His Leu Leu His His Asn Leu Ile Ala Glu Ala Ala
260 265 270

Ser Lys Leu Pro Gly Ile Thr Ala Thr Ala Val Ala Ala Val Val Ser
275 280 285

Ser Thr Ser Thr Pro Tyr Ala Ser Ala Ala Gln Ala Ser Ser Pro Ser
290 295 300

Ser Asn Asn His Asn Tyr Ser Ser Pro Ser Pro Ser Asn Ser Ile Gln
305 310 315 320

Ser Ile Ser Ser Ile Gly Ser Arg Ser Gly Gly Gly Glu Glu Gly Leu
325 330 335

Ser Leu Gly Ser Glu Ser Pro Arg Val Asn Val Glu Thr Glu Thr Pro
340 345 350

Ser Pro Ser Asn Ser Pro Pro Leu Ser Ala Gly Ser Ile Ser Pro Ala
355 360 365

Pro Thr Leu Thr Thr Ser Ser Gly Ser Pro Gln His Arg Gln Met Ser
370 375 380

Arg His Ser Leu Ser Glu Ala Thr Thr Pro Pro Ser His Ala Ser Leu
385 390 395 400

Met Ile Cys Ala Ser Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn Asn
405 410 415

Asn Asn Asn Gly Glu His Lys Gln Ser Ser Tyr Thr Ser Gly Ser Pro
420 425 430

Thr Pro Thr Thr Pro Thr Pro Pro Pro Pro Arg Ser Gly Val Gly Ser
435 440 445

Thr Cys Asn Thr Ala Ser Ser Ser Ser Gly Phe Leu Glu Leu Leu Leu
450 455 460

Ser Pro Asp Lys Cys Gln Glu Leu Ile Gln Tyr Gln Val Gln His Asn
465 470 475 480

Thr Leu Leu Phe Pro Gln Gln Leu Leu Asp Ser Arg Leu Leu Ser Trp
485 490 495

Glu Met Leu Gln Glu Thr Thr Ala Arg Leu Leu Phe Met Ala Val Arg
500 505 510

Trp Val Lys Cys Leu Met Pro Phe Gln Thr Leu Ser Lys Asn Asp Gln
515 520 525

His Leu Leu Leu Gln Glu Ser Trp Lys Glu Leu Phe Leu Leu Asn Leu
530 535 540

Ala Gln Trp Thr Ile Pro Leu Asp Leu Thr Pro Ile Leu Glu Ser Pro
545 550 555 560

Leu Ile Arg Glu Arg Val Leu Gln Asp Glu Ala Thr Gln Thr Glu Met
565 570 575

Lys Thr Ile Gln Glu Ile Leu Cys Arg Phe Arg Gln Ile Thr Pro Asp
580 585 590

Gly Ser Glu Val Gly Cys Met Lys Ala Ile Ala Leu Phe Ala Pro Glu
595 600 605

Thr Ala Gly Leu Cys Asp Val Gln Pro Val Glu Met Leu Gln Asp Gln
610 615 620

Ala Gln Cys Ile Leu Ser Asp His Val Arg Leu Arg Tyr Pro Arg Gln
625 630 635 640

Ala Thr Arg Phe Gly Arg Leu Leu Leu Leu Leu Pro Ser Leu Arg Thr
645 650 655

Ile Arg Ala Ala Thr Ile Glu Ala Leu Phe Phe Lys Glu Thr Ile Gly
660 665 670

Asn Val Pro Ile Ala Arg Leu Leu Arg Asp Met Tyr Thr Met Glu Pro
675 680 685

Ala Gln Val Asp Lys
690

<210> 16 
<211> 3049 
<212> DNA 

<213> Drosophila melanogaster 
<400> 16 

gtcagcccag gcgatccgca tttgcgtccg cagcaggttt ccgatttcag aactctgatt 60 

ccagcggcag cgaatcgcgt cggcatctga acatttgaaa ataatctaaa attgcaagtg 120 

actttgtgca ccggttacac taaaattgtt aacaaatcgc catatattct gaatttaaat 180 

ttaaagtgcg cagtgcggaa tataaatcgg agcaaactgg atacgttagg gttcaaatac 240 

ttccatcaac ggaaaatggg cacagcgggc gatcgcctgt tggacattcc ctgcaaggtg 300 

tgtggcgatc gcagctccgg caagcactat ggaatctaca gctgcgatgg ctgctccggt 360 

tttttcaagc ggagcattca tcgcaatcgg atttacacct gtaaggccac cggcgatctc 420 

aagggtcgct gtccggtgga caagacccat cggaatcagt gtcgcgcctg tcgcctggcc 480 

aagtgcttcc agtcggccat gaacaaggat gctgtgcagc acgagcgcgg tcctaggaaa 540

cccaagttgc acccgcaact gcatcatcat catcatcatg ctgctgccgc cgccgctgca 600 

gcgcatcatg cagcagccgc ccatcaccat caccatcatc accaccacgc ccacgcagcg 660 

gccgcccatc atgcggcagt ggctgcagcg gctgcctccg ggctgcatca ccaccaccac 720 

gccatgcccg tctcgctggt gaccaatgtc tcggcctcgt tcaactatac gcagcacatc 780 

tccacgcatc cgcctgctcc ggcggcgcca cccagtggct ttcacctgac ggccagtggc 840 

gcccagcagg gaccagctcc accagctggc cacctgcacc atggtggagc cggacatcag 900 

cacgccacgg ccttccacca tccgggacat ggacacgcgt tgcctgcccc acatggcggc 960 

gtgatcagca atcccggcgg caactcgagc gcaatctccg gcagcggacc cggctccacg 1020 

ctgcccttcc cctcgcacct gctgcaccac aatctgatag cggaggcggc cagcaagctg 1080 

ccgggcatca ctgccacagc cgttgcggcg gtggtgtcct ccactagcac gccctacgcc 1140

tcggcggccc aggcgtcgtc gcctagtagc aacaaccaca actactcctc gccctcgccc 1200

agcaactcca tccagtccat ctcgagcatt ggatcgcgca gcggtggtgg cgaggagggc 1260

ctcagcctgg gcagcgagag tccgcgcgtc aatgtggaaa ctgagacacc ttcgccctca 1320

aactcgccgc ccctgagtgc tggtagcatt tcgccagcgc ccacgttgac cacctcgtcg 1380

ggatcgccgc agcaccgcca gatgtcgcgg cacagcctca gtgaggcaac cacgccgccc 1440

agccacgcct ctctcatgat ttgcgccagc aacaacaaca acaacaacaa taataataac 1500

aacaacaata atggagagca caagcagtcg agctacacat ccggatcacc gacacccaca 1560

acgcccacgc cgccaccgcc gcgttctggt gtaggttcca cctgcaacac ggccagcagc 1620

tccagcggct tcctggagct gctgctcagt ccggacaagt gccaggagct catccagtac 1680

caggtgcagc acaacacgct gctcttcccg cagcagctgc tggactcgcg gctgctctcc 1740

tgggagatgc tgcaggagac gacggcgcga ctgctcttca tggcggtgcg ctgggtcaag 1800

tgcctcatgc cgttccagac gctctccaag aacgaccagc atctgctgct ccaggaatcc 1860

tggaaggagc tctttctgct taatctcgcc caatggacta taccgctgga tctaacgccc 1920

atactggaat caccgctcat ccgcgaacgg gtgctgcagg acgaggccac acagacggag 1980

atgaagacga tccaggagat cctgtgccgc ttccgccaga tcacacccga cggcagcgag 2040

gttggctgca tgaaggccat cgccctgttc gcacccgaaa ccgccggcct gtgcgacgtg 2100

cagccggtgg agatgttgca ggatcaggcg cagtgcatcc tctccgacca tgtgcgactg 2160

cgctacccgc gccaagcaac ccgcttcggc aggctgttgc tcctgctgcc ctcgctgcgc 2220

accatccggg cggccaccat cgaggcgctg ttcttcaagg agaccatcgg caatgtgccc 2280

attgctcgac tgctgcgcga catgtacacc atggaaccgg cacaggtgga caagtgaacc 2340

ggccacgcat ggccgtcgaa atgaaatcaa aatcgattcc ctagcaccta agcgccaccc 2400

atcggtcgtc gtcatatgcg aacttatttg tattccaatg ggacccgaat cgtattcaga 2460

ttcactgcgg caggaggcgg tccaaatgtg gggcggaagc tgcagatgct atggttcgca 2520

ggacgccatg taatggaggc gtatgtacta accgcgctcc tccattggcg atgcagtccg 2580

cgatgatggc gcactcccac acccacacca gtacccacac cttgatttat cgccggcaat 2640

gcgtcggagt ctccttactt tcgcttcgtt ttctaacatt tgtatcctta ttttatttca 2700

tctttttcca cggatttttc gttttgactg cctgggcggc actctttatt tatctttcat 2760

tcgacgtttt gtcgtcgctt ttctaaaaat tccccatgtt atttcaacct ggcaaggacc 2820

tcgcaatccc attcccgcgc ccttacttac aaatcacttc ccatcccaca tccagcaatt 2880

ccgtggtttg aattctttcg tgcattgact acgaaatacc ctttaatcag acaaataaag 2940

aatattagtt gtaattcttt tttctgcaat ccagctctaa aacgggtttc ttaatcgaaa 3000
tcgataaatg taaaaattat acatatcctt taccaacatt gtttgccta 3049 



<210> 17 
<211> 1974 
<212> DNA 

<213> Heliothis virescens 

<400> 17 

ccgagcggca cgcgtgccgc ttgatgctaa ctattctcat tgcgatcgta acataccacg 60 

cgctgtcaac acatgttcga acagtagtta tagtttcttt ggtgactgac acagacgctt 120 

aacgcctgta aacttcttac accgctcgta tagaacaaaa tgtattggat ttatgtgctt 180 

cctacaatag tgcttgaaca ggtgcagccc gcgagcggta gctagtgttg cggccagacc 240 

gctagtgaac tcggtgactg tgtgaacgtt gttgcggttg aacgccgcct tagggatttg 300 

taataaacag ccgcactgag cggccgagca tgacgatgga ccagcagaca ggcctcatgt 360

ccctcaatat gtccccgttt gatctgagcc caggccccga agggtctggt tcaggtgggg 420 

gtccctcgag tgcatcgcaa cagtacgtgc cgcaaggcgc cgcgtatcaa tgcccccctg 480 

aacaacaatc tttcggctat gccaatctgg atgcttcata tctgtttcca acaggcgccg 540 

gtggcgaacc aggcgcctac ttgcctgcag ccggcaccgt gtgtgatcag acagacacca 600 

aggatgtgat cgaagaactg tgtcccgtct gtggagacaa agtcagcggt tatcactacg 660 

ggctactgac gtgcgagtct tgtaagggct tcttcaaaag aaccgttcaa aacaagaagg 720 

tgtacacgtg cgtcgctgaa cgagcctgcc acatagacaa aacacaacgg aaacgctgcc 780 

cattttgccg cttccaaaag tgcctcgatg ttggcatgaa actcgaagct gtacgagctg 840 

accgcatgcg cggtggtcgt aacaagttcg gccccatgta caaacgggat cgtgcccgca 900 

aactgcagat gatgcgacag cgacaaatcg cagtgcagac attgcgcggc tcacttggtg 960 

acagcggcct ggtgctcggc ttcgggtctc cttacgccac agttccagtc aagcaagaga 1020 

tacagatccc tcaagtgtcg tctctcacgt cttcacccga gtcgtcgcct gggccggcct 1080 

tactggccgc ccagccgcag ccaccacagc cgccgccgcc gccagctcac gacaagtggg 1140

aagcacattc gccgcactcc gcatctccgg acgccttcgc gtttgacgcg ccagccacag 1200 

cagccgcaac tccatccagc accgccgaac ccacaagcac tgaaaccctg cgagtctcac 1260 

ccatgatacg cgaatttgta caaactatcg atgatcgcga gtggcagaat tcgctatttg 1320 

ggctcttgca gagccagact tacaatcagt gtgaggtcga tctctttgag ttaatgtgca 1380 

aagtattgga ccaaaactta ttttcacaag tggattgggc gagaaacacc gtgttcttta 1440 

agtatctaaa ggtcgatgat caaatgaagc tattgcaaca ctcgtggtcg gacatgctgg 1500 

tgttagatca cctacaccag aggatgcaca atggcctgcc tgatgagact accctccaca 1560

acggacagaa gtttgacctc ctttgtctgg gactcctggg cgttccatcc ctggctgatc 1620

acttcaacga actccagaac aaattagcag agctcaaatt cgatgtaccg gactacatat 1680 

gtgttaaatt cttgcttcta ttaaatcctg aggtgagagg catcgtgaat gtgaagtgcg 1740 

tccgagatgg ttatcagaca gtccaagccg cacttctgga ctatacgctg tcctgttatc 1800 

ctacgataca ggataaattc gacaaattag tgatggtagt acctgagatc catgccctag 1860 

cagctcgagg agaagagcac ctataccagc ggcactgtgc aggccaggcg cccacacaaa 1920 

cactcctaat ggaaatgcta cacgcgaagc gcaaatcttg aagtctcagt atgg 1974 

<210> 18 
<211> 543 
<212> PRT 

<213> Heliothis virescens 
<400> 18 

Met Thr Met Asp Gln Gln Thr Gly Leu Met Ser Leu Asn Met Ser Pro
1 5 10 15

Phe Asp Leu Ser Pro Gly Pro Glu Gly Ser Gly Ser Gly Gly Gly Pro
20 25 30

Ser Ser Ala Ser Gln Gln Tyr Val Pro Gln Gly Ala Ala Tyr Gln Cys
35 40 45

Pro Pro Glu Gln Gln Ser Phe Gly Tyr Ala Asn Leu Asp Ala Ser Tyr
50 55 60

Leu Phe Pro Thr Gly Ala Gly Gly Glu Pro Gly Ala Tyr Leu Pro Ala
65 70 75 80

Ala Gly Thr Val Cys Asp Gln Thr Asp Thr Lys Asp Val Ile Glu Glu
85 90 95

Leu Cys Pro Val Cys Gly Asp Lys Val Ser Gly Tyr His Tyr Gly Leu
100 105 110

Leu Thr Cys Glu Ser Cys Lys Gly Phe Phe Lys Arg Thr Val Gln Asn
115 120 125

Lys Lys Val Tyr Thr Cys Val Ala Glu Arg Ala Cys His Ile Asp Lys
130 135 140

Thr Gln Arg Lys Arg Cys Pro Phe Cys Arg Phe Gln Lys Cys Leu Asp
145 150 155 160

Val Gly Met Lys Leu Glu Ala Val Arg Ala Asp Arg Met Arg Gly Gly
165 170 175

Arg Asn Lys Phe Gly Pro Met Tyr Lys Arg Asp Arg Ala Arg Lys Leu
180 185 190

Gln Met Met Arg Gln Arg Gln Ile Ala Val Gln Thr Leu Arg Gly Ser
195 200 205

Leu Gly Asp Ser Gly Leu Val Leu Gly Phe Gly Ser Pro Tyr Ala Thr
210 215 220

Val Pro Val Lys Gln Glu Ile Gln Ile Pro Gln Val Ser Ser Leu Thr
225 230 235 240

Ser Ser Pro Glu Ser Ser Pro Gly Pro Ala Leu Leu Ala Ala Gln Pro
245 250 255

Gln Pro Pro Gln Pro Pro Pro Pro Pro Ala His Asp Lys Trp Glu Ala
260 265 270

His Ser Pro His Ser Ala Ser Pro Asp Ala Phe Ala Phe Asp Ala Pro
275 280 285

Ala Thr Ala Ala Ala Thr Pro Ser Ser Thr Ala Glu Pro Thr Ser Thr
290 295 300

Glu Thr Leu Arg Val Ser Pro Met Ile Arg Glu Phe Val Gln Thr Ile
305 310 315 320

Asp Asp Arg Glu Trp Gln Asn Ser Leu Phe Gly Leu Leu Gln Ser Gln
325 330 335

Thr Tyr Asn Gln Cys Glu Val Asp Leu Phe Glu Leu Met Cys Lys Val
340 345 350

Leu Asp Gln Asn Leu Phe Ser Gln Val Asp Trp Ala Arg Asn Thr Val
355 360 365

Phe Phe Lys Tyr Leu Lys Val Asp Asp Gln Met Lys Leu Leu Gln His
370 375 380

Ser Trp Ser Asp Met Leu Val Leu Asp His Leu His Gln Arg Met His
385 390 395 400

Asn Gly Leu Pro Asp Glu Thr Thr Leu His Asn Gly Gln Lys Phe Asp
405 410 415

Leu Leu Cys Leu Gly Leu Leu Gly Val Pro Ser Leu Ala Asp His Phe
420 425 430

Asn Glu Leu Gln Asn Lys Leu Ala Glu Leu Lys Phe Asp Val Pro Asp
435 440 445

Tyr Ile Cys Val Lys Phe Leu Leu Leu Leu Asn Pro Glu Val Arg Gly
450 455 460

Ile Val Asn Val Lys Cys Val Arg Asp Gly Tyr Gln Thr Val Gln Ala
465 470 475 480

Ala Leu Leu Asp Tyr Thr Leu Ser Cys Tyr Pro Thr Ile Gln Asp Lys
485 490 495

Phe Asp Lys Leu Val Met Val Val Pro Glu Ile His Ala Leu Ala Ala
500 505 510

Arg Gly Glu Glu His Leu Tyr Gln Arg His Cys Ala Gly Gln Ala Pro
515 520 525

Thr Gln Thr Leu Leu Met Glu Met Leu His Ala Lys Arg Lys Ser
530 535 540



<210> 19
<211> 2526
<212> DNA

<213> Heliothis virescens
<400> 19

cgccgccggc cgcacgcgcc ccatccgcgc ggctcatgcg tctcccgccg ctcccgctgc 60

ccgacatgac tgtgaccgag tgccagcgcc gactcctgga gcccagcgcc gcagagccgc 120

cgccccccgc gccccccact gattccgacg tgctgctcgg cagagttctc gcagagtttg 180

acggcactac ggtgctttgc cgcgtttgcg gtgataaagc tagtggattt cattacggcg 240

tgcattcctg tgaaggttgc aagggtttct tcaggcggtc tatccaacag aagatccaat 300

atagaccctg cacgaaaaac caacagtgct caatcctccg gatcaacaga aatcggtgcc 360

agtactgccg attgaaaaaa tgcatcgcag taggcatgag cagagatgcc gtgcggttcg 420

gacgtgttcc aaaacgcgag aaagcgcgta tccttgcagc aatgcagcaa tcttctacgt 480

cacgagcaaa tgagcaggct gcggccgcag agctcgatga tgctccgaga ttgctggcgc 540

gcgtggtgcg cgctcacctc gacacctgcg agttcacgcg ggatcgcgtc gccgctatgc 600

gtgcacgtgc tcgcgactgt cccatctact cgcaacccac tctggcttgc ccactaaacc 660

cggcgccaga gcttcagtct gagaaggaat tctctcagcg atttgcacac gttattcgtg 720

gtgtaataga cttcgctggc ctcatccccg gcttccagct tcttacacag gatgacaagt 780

ttacgctgct caaaagtgga ctatttgatg cgttatttgt gcggctaatc tgtatgtttg 840

acgcgccact caatagtata atttgtctaa atggccaagt catgaaacgg gactccatac 900

aaagtggtgc caatgcaagg ttccttgtgg attctacttt taaattcgcc gagcgcatga 960

actctatgaa cttaacagat gcagaaatcg cccttttctg cgctatagtg ctcatcactc 1020

cggataggcc gggacttcgc aatgtagaac tcgtggaaag aatgcatgcg cgtctgaagg 1080

catgcctgca aactgtggtc gcccagaata gaccggaccg accaggtttc ctcagggaac 1140

ttatggatac ccttcccgac ctccgcacat taagcacact ccatactgaa aagctagtag 1200

ttttccgtac tgaacacaaa gaattactac ggcagcagat gtggaccgat gaagaaggtg 1260

tgatgtcctg gggtgattct ggtgccgatg agtcagcacg cagccccatt gggtctgtgt 1320

ccagcagcga gtccggcgaa gccgtcggtg actgcggcac gccactcttg gccgctacac 1380

tggccggacg ccggcgactc gactctcgtg gatccgtaga tgaagaagca ctaggcgtgg 1440

cacatctagc tcacaacgga ctcactgtaa caccagtccg cccgcctcca cgataccgca 1500

aacttgattc gccaactgat tctggcatcg agtcgggaaa tgagaagcac gagaggatag 1560

ttggccccgg atcaggatgc tcaagcccgc gatcttccct agaggagcac acggatgata 1620

ggcggccacc tccatcagcc gatgacatgc ctgtgctaaa acgcgtgcta gaagcgccac 1680

cgctgtacca caccacttca ttaatggacg aggcgtataa acctcacaag aagttccgtg 1740

caatgcgccg tgacaccggc gaagctgagg cccgtgtggt caggccagcg ccctccacgc 1800

agccaccgca gcacccgcac cccgcgagcc cggcgcaccc ggcgcactcg ccgcgcccac 1860

tgcgcgcctc gctgtcatcc acgcactccg tgctggcgaa gagcctgatg gaaggcccgc 1920

gtatgacccc ggagcagctg aagcgcaccg acatcatcca gcagtacatg cggcgcggcg 1980

aggcgggcgc gcccgacggc tgccccatgc gcacgggtgg gctgctcacg tgctaccgcg 2040

gcgcctcgcc cgcgccgcag cccgtcatgg cgctgcaggt ggacgtgtcc gacgctgacg 2100

cgccccttaa cctctccaag aagtcgccgt cgccgccgcg ctcctttatg ccacagatgt 2160

tggaggcgtg agatcggccc gcggagccac itgtctccgc caccccaccc cgacctactt 2220

taaattaaat aatctagtta atagtaatcg ;tcgcagatc gttttaatat attatatgtg 2280

a t aat t toat taaaacacat 'cggcgccgg acgaggagag cgcaatagac 2340

tgaaaaataa tttaagacta tataagtgta itttataaat gcaatgcagc gccggtcgcg 2400

gtgtgtttgt tgtgtgtatt gtatgtgcgt ttgtgagcgt gctcagtcat ttgtttgtta 2460

atataatgac aggttgtcga actaaaataa tagaatcttg taaaaattaa aaaaaaaaaa 2520

aaaaaa 2526

<210> 20
<211> 711
<212> PRT

<213> Heliothis virescens
<400> 20

Met Arg Leu Pro Pro Leu Pro Leu Pro Asp Met Thr Val Thr Glu Cys
1 5 10 15

Gln Arg Arg Leu Leu Glu Pro Ser Ala Ala Glu Pro Pro Pro Pro Ala
20 25 30

Pro Pro Thr Asp Ser Asp Val Leu Leu Gly Arg Val Leu Ala Glu Phe
35 40 45

Asp Gly Thr Thr Val Leu Cys Arg Val Cys Gly Asp Lys Ala Ser Gly
50 55 60

Phe His Tyr Gly Val His Ser Cys Glu Gly Cys Lys Gly Phe Phe Arg
65 70 75 80

Arg Ser Ile Gln Gln Lys Ile Gln Tyr Arg Pro Cys Thr Lys Asn Gln
85 90 95

Gln Cys Ser Ile Leu Arg Ile Asn Arg Asn Arg Cys Gln Tyr Cys Arg
100 105 110

Leu Lys Lys Cys Ile Ala Val Gly Met Ser Arg Asp Ala Val Arg Phe
115 120 125

Gly Arg Val Pro Lys Arg Glu Lys Ala Arg Ile Leu Ala Ala Met Gln
130 135 140

Gln Ser Ser Thr Ser Arg Ala Asn Glu Gln Ala Ala Ala Ala Glu Leu
145 150 155 160

Asp Asp Ala Pro Arg Leu Leu Ala Arg Val Val Arg Ala His Leu Asp
165 170 175

Thr Cys Glu Phe Thr Arg Asp Arg Val Ala Ala Met Arg Ala Arg Ala
180 185 190

Arg Asp Cys Pro Ile Tyr Ser Gln Pro Thr Leu Ala Cys Pro Leu Asn
195 200 205

Pro Ala Pro Glu Leu Gln Ser Glu Lys Glu Phe Ser Gln Arg Phe Ala
210 215 220

His Val Ile Arg Gly Val Ile Asp Phe Ala Gly Leu Ile Pro Gly Phe
225 230 235 240

Gln Leu Leu Thr Gln Asp Asp Lys Phe Thr Leu Leu Lys Ser Gly Leu
245 250 255

Phe Asp Ala Leu Phe Val Arg Leu Ile Cys Met Phe Asp Ala Pro Leu
260 265 270

Asn Ser Ile Ile Cys Leu Asn Gly Gln Val Met Lys Arg Asp Ser Ile
275 280 285

Gln Ser Gly Ala Asn Ala Arg Phe Leu Val Asp Ser Thr Phe Lys Phe
290 295 300

Ala Glu Arg Met Asn Ser Met Asn Leu Thr Asp Ala Glu Ile Ala Leu
305 310 315 320

Phe Cys Ala Ile Val Leu Ile Thr Pro Asp Arg Pro Gly Leu Arg Asn
325 330 335

Val Glu Leu Val Glu Arg Met His Ala Arg Leu Lys Ala Cys Leu Gln
340 345 350

Thr Val Val Ala Gln Asn Arg Pro Asp Arg Pro Gly Phe Leu Arg Glu
355 360 365

Leu Met Asp Thr Leu Pro Asp Leu Arg Thr Leu Ser Thr Leu His Thr
370 375 380

Glu Lys Leu Val Val Phe Arg Thr Glu His Lys Glu Leu Leu Arg Gln
385 390 395 400

Gln Met Trp Thr Asp Glu Glu Gly Val Met Ser Trp Gly Asp Ser Gly
405 410 415

Ala Asp Glu Ser Ala Arg Ser Pro Ile Gly Ser Val Ser Ser Ser Glu
420 425 430

Ser Gly Glu Ala Val Gly Asp Cys Gly Thr Pro Leu Leu Ala Ala Thr
435 440 445

Leu Ala Gly Arg Arg Arg Leu Asp Ser Arg Gly Ser Val Asp Glu Glu
450 455 460

Ala Leu Gly Val Ala His Leu Ala His Asn Gly Leu Thr Val Thr Pro
465 470 475 480

Val Arg Pro Pro Pro Arg Tyr Arg Lys Leu Asp Ser Pro Thr Asp Ser
485 490 495

Gly Ile Glu Ser Gly Asn Glu Lys His Glu Arg Ile Val Gly Pro Gly
500 505 510

Ser Gly Cys Ser Ser Pro Arg Ser Ser Leu Glu Glu His Thr Asp Asp
515 520 525

Arg Arg Pro Pro Pro Ser Ala Asp Asp Met Pro Val Leu Lys Arg Val
530 535 540

Leu Glu Ala Pro Pro Leu Tyr His Thr Thr Ser Leu Met Asp Glu Ala
545 550 555 560

Tyr Lys Pro His Lys Lys Phe Arg Ala Met Arg Arg Asp Thr Gly Glu
565 570 575

Ala Glu Ala Arg Val Val Arg Pro Ala Pro Ser Thr Gln Pro Pro Gln
580 585 590

His Pro His Pro Ala Ser Pro Ala His Pro Ala His Ser Pro Arg Pro
595 600 605

Leu Arg Ala Ser Leu Ser Ser Thr His Ser Val Leu Ala Lys Ser Leu
610 615 620

Met Glu Gly Pro Arg Met Thr Pro Glu Gln Leu Lys Arg Thr Asp Ile
625 630 635 640

Ile Gln Gln Tyr Met Arg Arg Gly Glu Ala Gly Ala Pro Asp Gly Cys
645 650 655

Pro Met Arg Thr Gly Gly Leu Leu Thr Cys Tyr Arg Gly Ala Ser Pro
660 665 670

Ala Pro Gln Pro Val Met Ala Leu Gln Val Asp Val Ser Asp Ala Asp
675 680 685

Ala Pro Leu Asn Leu Ser Lys Lys Ser Pro Ser Pro Pro Arg Ser Phe
690 695 700

Met Pro Gln Met Leu Glu Ala
705 710



<210> 21 
<211> 2951 
<212> DNA 

<213> Heliothis virescens 

<400> 21 

gcacgagtca aattaacgtt ttattacgag tgtgattcat aatatcttca aagacatttg 60

tgtacagcaa gggggtgtat tgtttcatct tcttataggt atatttagtt caacgacctt 120

gttactgaca ggttcatcgt tagccactga actattctat gaaagtggta tatcgtagac 180

gatagataaa ttaaatccaa tgatggagcc ctcgagagat tcagggctaa acttggaggg 240

aggttttatg tcgccgatgt caccgccgga gatgaagcca gacacggcga tgctagacgg 300

cctgcgagac gactccaccc cacccccagc tttcaagaac taccccccga accatcccct 360

aagtggttct aagcacctct gttctatatg tggagataga gcgtcgggga aacattatgg 420

agtatacagt tgtgaaggtt gcaaaggttt cttcaaaagg acggtaagaa aagacttaac 480

gtacgcgtgc cgcgaagaac gcaactgcat catagacaaa cgtcagagga acagatgcca 540

gtactgcagg taccagaaat gtctcgcgtg cggcatgaag agggaagcgg tgcaggagga 600

gaggcagagg gccgccagag gtacggagga tgcacatccg agcagctcgg tgcaggtaca 660

ggagttatca atcgagcggt tgctggagat ggagtcactg gtagctgatc ccagcgaaga 720

gttccagttc cttcgtgtgg gacccgacag caatgtgcca cctaagttcc gcgcccctgt 780

ctccagcctt tgtcaaatag gcaacaaaca aatagcggcg ctagtggtgt gggcgcgcga 840

catcccgcac ttcagtcagc tggagatgga agatcagatc ctgctcatca aaggctcctg 900

gaacgaactg ctgctcttcg ctatagcgtg gcggtctatg gagttcctga cagaagagcg 960

agatggcgtg gacggcactg ggaacagaac cacatcgccg ccacaactca tgtgcctcat 1020

gcctggtatg acgctgcacc gcaactcagc gctgcaggcg ggcgtggggc agatcttcga 1080

ccgcgtgctg tcggagctgt cgctgaagat gcgcaccctg cgcgtcgacc aggccgagta 1140

cgtcgcgctc aaggccatca tactgctcaa cccagatgtg aagggactga aaaacaggca 1200

agaagtggaa gttttacgag aaaagatgtt cctgtgcctg gacgagtact gccgccgctc 1260

gcgcagttcg gaggagggtc ggttcgcggc gctgctgctg cgcctgcccg cgttacgttc 1320

catttcactc aagagcttcg agcacctgtt cttcttccac ctggtggccg acaccagcat 1380

cgccggctac atccgcgacg cgctgcgcaa ccacgcgccg cccatcgaca ccaacatgat 1440

gtaaccggca cacggtgtta ctacatctgg caactaaacg aacgcgttcg acaagcgata 1500

tcgcgaccga atgtgcagct acataacgaa ctattgtcta atattgttgt ttgttgaaag 1560

actcatggca attactgagg gggattttta atcaacgatg gtataactat gctggaaata 1620

aatgaaatat gccatcggta gtaataaagt agtgctaaag cagttcaaat ctatttctca 1680

aaatggaaac tgctatgagt tttttgacaa acttttcgtc tctatttcat aaaattctat 1740

cattatacga agctaccatt tgccggtcaa aggttcgagg ttaatgtcct ggctacggtc 1800

gctaattaga gtttattata tggaaagtta gctttagcag acgtggggag ggcgtgtcga 1860

agtggttatg aacattttcg aggtcttaga ttataatctt gaaacctcac attatgtgcg 1920

tgcgtgtaac cgctgcgcga cctctaacgc gatgtaggtg tttatcgaat aaacggtacc 1980

gaatacattt gttgctataa tttgtaggaa ttttggtata cgtttcgacg gataaacctt 2040

ataatttcgg taaatttaca aatcaggcct ttatttttcg agacagcaga cttgatgtta 2100

tgttagctct aaagatcaaa tagttaacgt tggtgaaaac cttcagctat aagtacctgg 2160

gttataaata gcccttattg taaaggccct ttatgttatg ttcatgtttg agtacgagca 2220

catcaccgtt gtatgaagcg cgggcacgcg acggcagtcg tgtactaggt gtccgtgtgg 2280

gcacggtatc gacacacgac gttaggtagt gtaaatgggg ccgccttttt tattacgtag 2340

aatatgatta acgtaaatta gtcattcatt tattagggaa ttttaatcgt acaggtaggt 2400

gtccatagtg aattaatata atgttagact aatgtgtaat tttaacattt accaaatctc 2460

tcttcaaagc acgtacatta attattacgt atatttaatg gattgactat gtgtccactg 2520

agccagtgat aaaaatcatg ttgattcgta aaaacgaaaa tgaaagaaaa atatacaaaa 2580

ttatgcttca tgtatttccc gtattataca catgttggtg ctttaacatg cattatttac 2640

atattatgca tatcaaatat gcatagtaat ccttaaattt aatgttactt gataaatcct 2700

tattttcatt a coat - tacit - a f aaaaaat ta Taaaaactat aaat~atatrr 1~ rcrcat'cjrar' 2760

aatgtatttt tattatgttt agacgagtaa cttatattgc atattaatat atacctcacc 2820

agtctcgggt ggtgggacta aacagataga atggcctaca cgctacattt ataaattata 2880

acttgtattt ctactgacag gccactttat ttttgtacct tgtgagtgcc acgcattttc 2940

caattcgcta t 2951



<210> 22
<211> 420
<212> PRT
<213> Heliothis virescens
<400> 22

Ile Asp Lys Leu Asn Pro Met Met Glu Pro Ser Arg Asp Ser Gly Leu
1               5                   10                  15

Asn Leu Glu Gly Gly Phe Met Ser Pro Met Ser Pro Pro Glu Met Lys
            20                  25                  30

Pro Asp Thr Ala Met Leu Asp Gly Leu Arg Asp Asp Ser Thr Pro Pro
        35                  40                  45

Pro Ala Phe Lys Asn Tyr Pro Pro Asn His Pro Leu Ser Gly Ser Lys
    50                  55                  60

His Leu Cys Ser Ile Cys Gly Asp Arg Ala Ser Gly Lys His Tyr Gly
65                  70                  75                  80

Val Tyr Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys Arg Thr Val Arg
                85                  90                  95

Lys Asp Leu Thr Tyr Ala Cys Arg Glu Glu Arg Asn Cys Ile Ile Asp
            100                 105                 110

Lys Arg Gln Arg Asn Arg Cys Gln Tyr Cys Arg Tyr Gln Lys Cys Leu
        115                 120                 125

Ala Cys Gly Met Lys Arg Glu Ala Val Gln Glu Glu Arg Gln Arg Ala
    130                 135                 140

Ala Arg Gly Thr Glu Asp Ala His Pro Ser Ser Ser Val Gln Val Gln
145                 150                 155                 160

Glu Leu Ser Ile Glu Arg Leu Leu Glu Met Glu Ser Leu Val Ala Asp
                165                 170                 175

Pro Ser Glu Glu Phe Gln Phe Leu Arg Val Gly Pro Asp Ser Asn Val
            180                 185                 190

Pro Pro Lys Phe Arg Ala Pro Val Ser Ser Leu Cys Gln Ile Gly Asn
        195                 200                 205

Lys Gln Ile Ala Ala Leu Val Val Trp Ala Arg Asp Ile Pro His Phe
    210                 215                 220

Ser Gln Leu Glu Met Glu Asp Gln Ile Leu Leu Ile Lys Gly Ser Trp
225                 230                 235                 240

Asn Glu Leu Leu Leu Phe Ala Ile Ala Trp Arg Ser Met Glu Phe Leu
                245                 250                 255

Thr Glu Glu Arg Asp Gly Val Asp Gly Thr Gly Asn Arg Thr Thr Ser
            260                 265                 270

Pro Pro Gln Leu Met Cys Leu Met Pro Gly Met Thr Leu His Arg Asn
        275                 280                 285

Ser Ala Leu Gln Ala Gly Val Gly Gln Ile Phe Asp Arg Val Leu Ser
    290                 295                 300

Glu Leu Ser Leu Lys Met Arg Thr Leu Arg Val Asp Gln Ala Glu Tyr
305                 310                 315                 320

Val Ala Leu Lys Ala Ile Ile Leu Leu Asn Pro Asp Val Lys Gly Leu
                325                 330                 335

Lys Asn Arg Gln Glu Val Glu Val Leu Arg Glu Lys Met Phe Leu Cys
            340                 345                 350

Leu Asp Glu Tyr Cys Arg Arg Ser Arg Ser Ser Glu Glu Gly Arg Phe
        355                 360                 365

Ala Ala Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile Ser Leu Lys
    370                 375                 380

Ser Phe Glu His Leu Phe Phe Phe His Leu Val Ala Asp Thr Ser Ile
385                 390                 395                 400

Ala Gly Tyr Ile Arg Asp Ala Leu Arg Asn His Ala Pro Pro Ile Asp
                405                 410                 415

Thr Asn Met Met
            420

<210> 23 
<211> 645 
<212> DNA 

<213> Heliothis virescens 
<400> 23 

gcacgaggtg gcgaccgctt taccggcaaa cattatggag cctcgtcctg cgatggctgc 60 
aagggcttct tcaggcgaag cgtcaggaaa aatcatctgt acacatgcag gttcagcaga 120 
aattgtgtgg tggacaaaga caagagaaac cagtgcagat actgtaggct aaggaaatgc 180 
ttcaaggcgg gcatgaagaa agaagcggtg cagaacgagc gggaccgcat caactgccga 240
agaccttcgt atgaggagcc gacgcaggcc aacgggctgt cggtagtgtc gctactcaat 300
gctgaactgc tcagtaggaa ggttattgat gagacaaaca acgtgacaga cgcggagatc 360
aacaaccgca agctggctaa gatcaacgac gtgtgcgaca gcatcaagca gcagctgctc 420 
atactggtcg agtgggccaa gtacataccc gccttcactg agctgcatct tgatgatcag 480 
gtggcgctgc tgcgcgcgca cgcaggcgag cacctgctgc tgggctgcgc gcgccgctcg 540
ctgcacctca gcgacatcct gctgctcggc aacaactgca tcatcaccaa gcacaatatc 600 
gatggccgca tggacataga catcagcatg atcggcatgc gcgtg 645 

<210> 24 
<211> 215 
<212> PRT 

<213> Heliothis virescens 
<400> 24 

Ala Arg Gly Gly Asp Arg Phe Thr Gly Lys His Tyr Gly Ala Ser Ser
1               5                   10                  15

Cys Asp Gly Cys Lys Gly Phe Phe Arg Arg Ser Val Arg Lys Asn His
            20                  25                  30

Leu Tyr Thr Cys Arg Phe Ser Arg Asn Cys Val Val Asp Lys Asp Lys
        35                  40                  45

Arg Asn Gln Cys Arg Tyr Cys Arg Leu Arg Lys Cys Phe Lys Ala Gly
    50                  55                  60

Met Lys Lys Glu Ala Val Gln Asn Glu Arg Asp Arg Ile Asn Cys Arg
65                  70                  75                  80

Arg Pro Ser Tyr Glu Glu Pro Thr Gln Ala Asn Gly Leu Ser Val Val
                85                  90                  95

Ser Leu Leu Asn Ala Glu Leu Leu Ser Arg Lys Val Ile Asp Glu Thr
            100                 105                 110

Asn Asn Val Thr Asp Ala Glu Ile Asn Asn Arg Lys Leu Ala Lys Ile
        115                 120                 125

Asn Asp Val Cys Asp Ser Ile Lys Gln Gln Leu Leu Ile Leu Val Glu
    130                 135                 140

Trp Ala Lys Tyr Ile Pro Ala Phe Thr Glu Leu His Leu Asp Asp Gln
145                 150                 155                 160

Val Ala Leu Leu Arg Ala His Ala Gly Glu His Leu Leu Leu Gly Cys
                165                 170                 175

Ala Arg Arg Ser Leu His Leu Ser Asp Ile Leu Leu Leu Gly Asn Asn
            180                 185                 190

Cys Ile Ile Thr Lys His Asn Ile Asp Gly Arg Met Asp Ile Asp Ile
        195                 200                 205

Ser Met Ile Gly Met Arg Val
    210                 215



<210> 25
<211> 716
<212> DNA
<213> Heliothis virescens
<220>
<221> misc_feature
<222> (1)..(716)
<223> n is a, c, g, or t
<400> 25

gcacgagtgc aagggcttct ttagacgatt acaaagcaca gtggtgaact accaatgtcc 60
acggaataaa gcctgcgtgg tggaccgcgt caacaggaat cgctgccagt actgcagatt 120
gcagaaatgc ctcaaacttg gaatgagccg tgatgcggtg aagtttggac ggatgtcgaa 180
aaagcaacgt gagaaggtgg aagatgaggt gagattccat cgtgcacaga tgagagcgca 240
gactgacaca gcgcccgatt cagtatacga cgcccaacaa cagacgccga gctcaagcga 300
ccagttccat ggtcactaca atggctaccc gggctacgga tccccattgt cctcatacgg 360
gtacaacaac gcgggcccag cattacagtc caacatgggc ggcatacagc cccaaccgcc 420
ccaacagcag ccctacgacg tgtctgctga ctacgtggac tccaccacag cttacgagcc 480
gaaacagaac cggggattct tggaccctga ttntattagt catgcggagg gtgacatcag 540
caaagttctg gtgaagagtt tggcagaagc acatgcgaat accaacctta nactggnant 600
cattcacgag atttcagaaa gccacaaaga cgtcttccca agctttncta tactatagnt 660
cctatgacgt accgaaaggn aaaaagggaa aaaaatattt tttttttggg gggggc 716

<210> 26
<211> 218
<212> PRT
<213> Heliothis virescens
<220>
<221> misc_feature
<222> (1)..(218)
<223> Xaa is any amino acid
<400> 26

His Glu Cys Lys Gly Phe Phe Arg Arg Leu Gln Ser Thr Val Val Asn
1               5                   10                  15

Tyr Gln Cys Pro Arg Asn Lys Ala Cys Val Val Asp Arg Val Asn Arg
            20                  25                  30

Asn Arg Cys Gln Tyr Cys Arg Leu Gln Lys Cys Leu Lys Leu Gly Met
        35                  40                  45

Ser Arg Asp Ala Val Lys Phe Gly Arg Met Ser Lys Lys Gln Arg Glu
    50                  55                  60

Lys Val Glu Asp Glu Val Arg Phe His Arg Ala Gln Met Arg Ala Gln
65                  70                  75                  80

Thr Asp Thr Ala Pro Asp Ser Val Tyr Asp Ala Gln Gln Gln Thr Pro
                85                  90                  95

Ser Ser Ser Asp Gln Phe His Gly His Tyr Asn Gly Tyr Pro Gly Tyr
            100                 105                 110

Gly Ser Pro Leu Ser Ser Tyr Gly Tyr Asn Asn Ala Gly Pro Ala Leu
        115                 120                 125

Gln Ser Asn Met Gly Gly Ile Gln Pro Gln Pro Pro Gln Gln Gln Pro
    130                 135                 140

Tyr Asp Val Ser Ala Asp Tyr Val Asp Ser Thr Thr Ala Tyr Glu Pro
145                 150                 155                 160

Lys Gln Asn Arg Gly Phe Leu Asp Pro Asp Xaa Ile Ser His Ala Glu
                165                 170                 175

Gly Asp Ile Ser Lys Val Leu Val Lys Ser Leu Ala Glu Ala His Ala
            180                 185                 190

Asn Thr Asn Leu Xaa Leu Xaa Xaa Ile His Glu Ile Ser Glu Ser His
        195                 200                 205

Lys Asp Val Phe Pro Ser Phe Xaa Ile Leu
    210                 215



<210> 27 
<211> 5206 
<212> DNA 

<213> Drosophila melanogaster 
<400> 27 

ggctcgccca ttggagggcc cctgtcctgt ggcagcagct tgcccagctt ccaggagacc 60
tactccttga agtacaacag cagcagcggt agcagccccc agcaggcgtc ctcctcctcc 120
accgccgccc ccacgcccac tgaccaggtg ctgaccctca agatggacga ggactgcttc 180
ccgcctctgt ccggcggctg gagtgccagt ccgcccgccc cctcccagct ccagcagctg 240
cacaccctgc agtctcaggc ccagatgtcg catcccaaca gcagcaacaa cagcagcaac 300
aacgcgggca acagccacaa caacagtggg ggctacaact accacggcca cttcaatgcc 360
atcaatgcca gcgccaatct gtcgcccagc tcctcggcca gttccctcta cgaatataat 420
ggtgtttccg cagcggacaa cttctacgga caacagcagc agcagcaaca gcaaagctat 480
cagcaacata actacaactc gcacaatggc gagcgttact cgctgcccac gtttcccacg 540
atttcggagc tggctgcggc cactgctgct gtcgaagctg cggcggcggc cacagtgggc 600
ggtccgccgc cagtacgccg agcatcgctg ccggttcagc gaaccgttct gccagccggc 660
tccacggcgc agagccccaa gctggccaag atcacactga accagcggca ctcacatgcc 720
catgcccatg ccctacagct caactcggca cccaattcgg cggcaagttc gccagcgagt 780
gcggatctgc aggcgggccg tttgctccag gctccgtcgc agctgtgtgc tgtttgtggc 840
gacaccgccg cctgtcagca ttatggagtg cgaacctgcg agggatgcaa gggatttttc 900
aagcggaccg tgcagaaggg ctccaagtac gtctgcctgg cggacaagaa ttgcccggtg 960
gacaagaggc gccgcaaccg ttgccagttc tgccggttcc agaagtgcct ggtcgtaggc 1020
atggtcaagg aagtggtgcg caccgactcg ttgaagggtc gccgcggaag actgccctca 1080
aaaccgaaat cgccccagga gtcgccacca tcaccaccca tctcgttgat cacggccctg 1140
gtgcgcagcc atgtcgacac gactccggat ccctcgtgcc tggactatag ccactacgag 1200
gagcagtcga tgagcgaggc agataaggtg caacagtttt accagctgct gaccagctcc 1260
gtggacgtga tcaaacagtt cgccgagaag attcccggct acttcgatct cctgccggag 1320
gatcaggagc tgctcttcca gagcgcatcg ctggaactgt tcgtcctgcg gctggcctat 1380
cgcgccagga tcgatgacac caagctgatc ttctgcaacg gcacggtgct ccaccgcacc 1440
cagtgcctgc gctctttcgg cgagtggctc aacgacatca tggagttcag ccgcagcctg 1500
cacaacctgg agatcgacat ctccgccttc gcctgcctct gtgccctaac cctcatcaca 1560
gaacgccatg gcctgcggga gccgaagaag gtggagcagc tccagatgaa gatcattggc 1620
agtctgcgcg accacgtcac ctacaatgcc gaggcccaga agaagcagca ctacttcagc 1680
cgcctgctgg gcaagctgcc ggagctgagg tccctgagtg tccagggact gcagaggatc 1740
ttctacctga agctggagga cctggtaccc gcaccagctc tcatcgagaa catgttcgtc 1800
accacattgc ccttctagag gcgatcatca agcgtatcat cacaacttgc ttccttaaaa 1860
ctagccccta agtatgcctc ctaggatata cagagaaagg accccatagg acggacgcaa 1920
ctagctttag tagaaccctg aaataaataa atctcacaac agcaaaaaca aaaccgaacc 1980
gaacagaaat gaagcgaata gcagacccag gccatatctt tagtgtagag ctaggtagtt 2040
accgggacag cccggctcct tcgataatta cggacatgca tatttgagag ggggtttcca 2100
gtgcacagcc tatggctcct gcgtgactcg tcagcacgcg agctccaact tgttgacgtt 2160
aattgttaaa ttgtttaatt tcaactgtca aaaccggaat caacggccgg gcacgcaatg 2220
gcaacacttt ctatccccgg acttcgaagc ctgctcaaca ttcggcacta cggacggaca 2280
aacaacggac agaaacagaa ctcactcttg ctctcttgcc ttttactaac ttctagtcaa 2340
ttgatttagg cgaatcaaat aaataaataa ataaaataag ggcgtgcagc agtagtgtta 2400
tataatttat atccagaccc cagcggttct cttcaaggaa atcccccaat gagttgcaca 2460
aattgggata aagtacgata gcctattatt cttatatttc ttttaaaagc tcgaagatag 2520
atgagaactg tgttggaaat tccactatca tatcatatag ttgctataag ccgtgcttgc 2580
cctaagctaa gttagacccg cataaagttg atagcccaac caagtatttc ggttatttcc 2640
tagactaagg tcctaatagt tataggctaa gactattctg ttcgattatc aatgcaccaa 2700
acagtgcaca atgagagtat aagtaccttc ttgtgatgat tgtgtctgac acagagagag 2760
ttgcacacaa gcacacaaac tagccgataa gttactaaat acgatctaat atctaatata 2820
tataatataa tataatatat ataagtccaa gtattcggaa atccaagaac ccttgcataa 2880
ccgcagttcg tacgttccaa acgagaaaag aactttattt aatcctagac ccactcatct 2940
aagcttctaa agaatcgtat gtggatcgtt ggatctgtct ctctatatat gtgtgtgtgt 3000
gttatctcga tagaaaaccc ctctatgtga ttttgtgata gattggcatt gaactctata 3060
tatttatata tatatatgtc tataatatat atacacgcat aaatatatat ttttatgtct 3120
aacttttgta tggtttattt tatacgtacc acttttcttt gataacaaaa agtaaaaaac 3180
tcgttagata gcaaatattt caaaggtatg ttacgaggac ttttcaaagg tatgttcgag 3240
gacttttcaa agtaccagtc tttagcgact ttccaattaa cgttcgtatt aacgaaagac 3300
agattttcta tgtgttaaat tgaaagactt ctataactat aactaaatgc aagctaagag 3360
caaaaacaca aatccacaaa tccccaaagt gaataacata tctcttcaag ctttcgagtg 3420
ctcggaacac gtagaaccga aacccaagtg ttactaaatc catttaataa tcgcaagccg 3480
ggggcgtcgg cgtggttaat acgttctcat tacctataca atttagatag atcattatta 3540
aattattgta catgtagcac atgaaatgtt cgacaactag attttgtacc atcttaaaga 3600
agaacctagg ccaagctaaa ctaagtataa actatgatct gcatgcggct agctgtagct 3660
[row illegible in source] 3720
cgttctaaac gcgacgacta actctcccaa ctgcgaactc taccaattaa gagaaattcc 3780
cagaaaatgt gtcaggattt caaagcgtcc catctcactt gaacccaccc aatcaacaaa 3840
tacaaatcct agggaagttg agaggttcag caaccataga gcaatatttc ataagaaaac 3900
gcaccttaaa ttaccgaaaa acatagatta acctgatctt gtaacgtttg ggagcgataa 3960
taagccagga ttaaacagga acagttaggt gaccaaatca gttcgaaacg aagatgatag 4020
atagttcggt tcgaaaccct aaacgcgatg ccattttagc cgttacaaca ttggatatca 4080
atcatgcaca tgaatatgaa tatgaatatg aatattatag agatatatct agctatagga 4140
acctactttg tacctacacg acatggaaac atcaaaccta catgcatatt tacacacata 4200
tattttgaat agagcgacga cttttacaag ttgcgtagaa agctatagct atagcttgat 4260
atggccatcc cagagcgagc atatacatat attttgggtt attgttcttt tgtaatttta 4320
taaatgcata catatttatg tactacgtga atgtcaagtg tggattcata tttttgagat 4380
acagctacaa aaacgaaaca aaagaaaata aaacaaaaca gaagagtaaa cggtgaaatt 4440
ttttcgatga aacaatttta aatgagaact ttttaatatt gctattaaag gatatacata 4500
tacacactaa catacaatat atattttact atgtaacgga tagaattaag ctagatgcag 4560
cgcataaagc tttatacaac aaattgaaaa gcaacagaag aaattggcac aaattaaatt 4620
tatatagcat aattagacgt ccttcgcaag ataatgttat tgctaataag agcgtcaatc 4680
ggtacatcgg gcgctatttc ccactacacc cccaaccaca caatagataa cctaagcaat 4740
gtatgtacat tagctatgta tatccagccc acttatgcgc ctactactag aaatgcagaa 4800
agcagaaaga gaggtgaaac ctatagacgc tatcacaaat gtctatctga tagacatcgg 4860
tactaccaat gctatattgc cagttgtgta atttactctt atttgatcgt ttcatttacc 4920
agttaagaac ccaaatcata taagtgttat gatggaagaa ctataacttg caattcaatt 4980
aactctgcat acgataacaa gcaaagcgaa tcatttcatt tcgatttaat ctttaattat 5040
atatacttaa acgatgtaag cccaaaacaa acgttttttc tatatctgtc ttttgagcaa 5100
attagttata cgcaaaacca aaccgtattt acataaatgt atacaaaaca aatcgtatat 5160
tttcattggt ttgaaataaa tacataaaac aaaaaaaaaa aaaaaa 5206

<210> 28
<211> 551
<212> PRT
<213> Drosophila melanogaster
<400> 28

Met Asp Glu Asp Cys Phe Pro Pro Leu Ser Gly Gly Trp Ser Ala Ser
1               5                   10                  15

Pro Pro Ala Pro Ser Gln Leu Gln Gln Leu His Thr Leu Gln Ser Gln
            20                  25                  30

Ala Gln Met Ser His Pro Asn Ser Ser Asn Asn Ser Ser Asn Asn Ala
        35                  40                  45

Gly Asn Ser His Asn Asn Ser Gly Gly Tyr Asn Tyr His Gly His Phe
    50                  55                  60

Asn Ala Ile Asn Ala Ser Ala Asn Leu Ser Pro Ser Ser Ser Ala Ser
65                  70                  75                  80

Ser Leu Tyr Glu Tyr Asn Gly Val Ser Ala Ala Asp Asn Phe Tyr Gly
                85                  90                  95

Gln Gln Gln Gln Gln Gln Gln Gln Ser Tyr Gln Gln His Asn Tyr Asn
            100                 105                 110

Ser His Asn Gly Glu Arg Tyr Ser Leu Pro Thr Phe Pro Thr Ile Ser
        115                 120                 125

Glu Leu Ala Ala Ala Thr Ala Ala Val Glu Ala Ala Ala Ala Ala Thr
    130                 135                 140

Val Gly Gly Pro Pro Pro Val Arg Arg Ala Ser Leu Pro Val Gln Arg
145                 150                 155                 160

Thr Val Leu Pro Ala Gly Ser Thr Ala Gln Ser Pro Lys Leu Ala Lys
                165                 170                 175

Ile Thr Leu Asn Gln Arg His Ser His Ala His Ala His Ala Leu Gln
            180                 185                 190

Leu Asn Ser Ala Pro Asn Ser Ala Ala Ser Ser Pro Ala Ser Ala Asp
        195                 200                 205

Leu Gln Ala Gly Arg Leu Leu Gln Ala Pro Ser Gln Leu Cys Ala Val
    210                 215                 220

Cys Gly Asp Thr Ala Ala Cys Gln His Tyr Gly Val Arg Thr Cys Glu
225                 230                 235                 240

Gly Cys Lys Gly Phe Phe Lys Arg Thr Val Gln Lys Gly Ser Lys Tyr
                245                 250                 255

Val Cys Leu Ala Asp Lys Asn Cys Pro Val Asp Lys Arg Arg Arg Asn
            260                 265                 270

Arg Cys Gln Phe Cys Arg Phe Gln Lys Cys Leu Val Val Gly Met Val
        275                 280                 285

Lys Glu Val Val Arg Thr Asp Ser Leu Lys Gly Arg Arg Gly Arg Leu
    290                 295                 300

Pro Ser Lys Pro Lys Ser Pro Gln Glu Ser Pro Pro Ser Pro Pro Ile
305                 310                 315                 320

Ser Leu Ile Thr Ala Leu Val Arg Ser His Val Asp Thr Thr Pro Asp
                325                 330                 335

Pro Ser Cys Leu Asp Tyr Ser His Tyr Glu Glu Gln Ser Met Ser Glu
            340                 345                 350

Ala Asp Lys Val Gln Gln Phe Tyr Gln Leu Leu Thr Ser Ser Val Asp
        355                 360                 365

Val Ile Lys Gln Phe Ala Glu Lys Ile Pro Gly Tyr Phe Asp Leu Leu
    370                 375                 380

Pro Glu Asp Gln Glu Leu Leu Phe Gln Ser Ala Ser Leu Glu Leu Phe
385                 390                 395                 400

Val Leu Arg Leu Ala Tyr Arg Ala Arg Ile Asp Asp Thr Lys Leu Ile
                405                 410                 415

Phe Cys Asn Gly Thr Val Leu His Arg Thr Gln Cys Leu Arg Ser Phe
            420                 425                 430

Gly Glu Trp Leu Asn Asp Ile Met Glu Phe Ser Arg Ser Leu His Asn
        435                 440                 445

Leu Glu Ile Asp Ile Ser Ala Phe Ala Cys Leu Cys Ala Leu Thr Leu
    450                 455                 460

Ile Thr Glu Arg His Gly Leu Arg Glu Pro Lys Lys Val Glu Gln Leu
465                 470                 475                 480

Gln Met Lys Ile Ile Gly Ser Leu Arg Asp His Val Thr Tyr Asn Ala
                485                 490                 495

Glu Ala Gln Lys Lys Gln His Tyr Phe Ser Arg Leu Leu Gly Lys Leu
            500                 505                 510

Pro Glu Leu Arg Ser Leu Ser Val Gln Gly Leu Gln Arg Ile Phe Tyr
        515                 520                 525

Leu Lys Leu Glu Asp Leu Val Pro Ala Pro Ala Leu Ile Glu Asn Met
    530                 535                 540

Phe Val Thr Thr Leu Pro Phe
545                 550



<210> 29 
<211> 4778 
<212> DNA 

<213> Drosophila melanogaster 
<400> 29 

gaaaaaacaa acattttgct acttcgtcgc aggcgggact gtgttgcgtc gtgtgatcgc 60
tagagcggtt gtggaatcgg attcgagcgc aaaacaccgt tcatgctgtg aaaaatccga 120
tatttgtcgt gcaataattt cctcgattgg catcaagtgg cttccagtcg ggtacatatt 180
gcacaagaaa tgttatacgc ataatgtgca cgcaaattaa acgaattctc tatgaaaatg 240
tgactagaat gtgagtcgaa caaaacgagt aaaacgtgaa atcccaactg gcttttgggt 300
aacaaatctt atcaacacag caacggaaat acattaaaat cttgatagac tgagaaaggg 360
acaattggaa tacttttagt tatttttaaa tgttttacaa cacaatggaa ctgcatcaac 420
gacacctctc aaacttttac aaattgcaca actgagaaat agtctttgat aaataaataa 480
aatataagaa atcgctactg aaacaagatg ccaaacatgt ccagcatcaa agcggagcag 540
caaagcggtc ctcttggagg aagtagcggc tatcaagtac cggtcaacat gtgcaccacc 600
acagtcgcga atacgacgac cactttggga agctccgccg ggggagccac tggctcccgg 660
cacaacgtct ccgtgacaaa catcaagtgc gaactagacg aactaccgtc accgaacggc 720
aacatggtgc cggttatcgc aaactacgtt cacggtagct tgcgcattcc actcagtgga 780
cattcaaatc atagggagtc cgattcggag gaggagctgg caagtattga gaacttgaag 840
gttcggcgaa ggacggcggc ggacaaaaat ggtcctcgtc caatgtcctg ggagggcgag 900
ctgagcgata ctgaggtcaa cgggggcgaa gagctgatgg aaatggagcc aacaattaag 960
agtgaggtgg tccctgctgt tgcaccccca caacccgtct gcgcactaca accgataaaa 1020
acagagctag agaacattgc aggcgagatg cagattcaag ggaagtgtta cccccagtcc 1080
aacacacaac atcacgctgc cacaaaatta aaagtggccc cgacgcaaag tgatccgatc 1140
aatctcaagt tcgaaccgcc tctgggagac aattctccgc tactggctgc acgtagcaag 1200
tccagcagtg gaggccacct accactgcca acgaatccca gtcccgactc cgccatacat 1260
tccgtctaca cgcacagctc cccctcgcag tcgcctctga cgtcgcgcca cgccccctac 1320
actccgtctc tgagccgcaa caacagcgac gcctcgcaca gtagctgcta cagctatagc 1380
tccgaattca gtcccacaca ctcgcccatt caagcgcgtc atgccccacc cgccggcacg 1440
ctctatggca accaccatgg tatttaccgc cagatgaagg tggaagcctc atccactgtg 1500
ccgtccagtg ggcaggaggc gcagaacctg agtatggact ctgcctctag caatctggat 1560
acagtgggct taggatcttc gcaccccgca tctccggcgg gcatatcacg tcagcagttg 1620
atcaactcgc cctgccccat ctgcggtgac aagatcagcg gatttcatta ctttattttc 1680
tcctgcgagt cttgcaaggg cttcttcaag cgcaccgtgc aaaatcgcaa gaactacgtg 1740
tgcgtgcgtg gtggaccatg tcaggtcagc atttccacgc gcaagaaatg tccagcctgc 1800
cgcttcgaga agtgtctgca gaagggaatg aaactagaag cgattcggga ggaccgaacc 1860
cgtggcggcc gctccacata ccagtgctcc tacacgctgc ccaactcaat gcttagtccg 1920
ctgcttagtc ctgatcaagc ggcagcagct gccgccgcag cagcagtggc aagtcagcag 1980
cagccgcacc agcgactaca tcaactaaat ggatttggag gtgtacccat tccctgctct 2040
acttctcttc cagccagccc tagtttggca ggaacttcgg tcaagtcgga agagatggcg 2100
gagacgggca agcaaagcct ccgaacggga agcgtaccac cactactgca ggaaatcatg 2160
gatgtagagc atctgtggca gtacaccgat gcagagctgg cccgcatcaa ccaaccactg 2220
tccgcattcg cctctggcag ctcttcgtcg tcgtcatcgt caggtacatc ctcaggcgcc 2280
catgcacaac tcaccaatcc actactggct agtgctggtc tctcgtccaa tggcgagaat 2340
gccaatcctg atcttatcgc tcatctctgc aacgtggctg atcaccgtct ttataaaatc 2400
gtcaaatggt gcaagagctt gccgcttttt aagaacattt cgatcgatga ccaaatctgc 2460
ttgctcatta actcgtggtg cgagctgttg ctcttctcct gctgttttag atcaattgat 2520
actcctggag agattaaaat gtcacaaggc aggaagataa ccctatcgca ggccaaatca 2580
aatggcttgc agacttgcat tgaacggatg ctcaacctaa cagatcacct gaggcgattg 2640
cgcgttgatc gctacgaata tgttgccatg aaagttattg tgctgttgca gtcagatacg 2700
acagagttac aggaagcggt aaaggtgcgc gagtgtcagg aaaaagcttt gcagagcttg 2760
caagcttaca ccctggcgca ttatcctgac acgcaatcca agtttgggga gcttttgcta 2820
cgcattcctg atttgcagcg aacgtgccag cttggcaagg agatgttgac gatcaagact 2880
cgcgatggag ctgatttcaa tttgctaatg gagcttttgc gcggagagca ttgacaattg 2940
ataactaaga cggaaatctt ttaccattgg caaaacaagt ttcacatatt tagtattaga 3000
tatatatatt ctatagataa gatccttact gtaagttctg aaaacatgtg cctaaaaacc 3060
aaagccacga tagcagtcac atcaggccca ctggtcgaga ttaaatccaa gagcaagatt 3120
gccaaatttt tacaccaata tatattttga tatgagccat gtgcagggcc tcagatcgct 3180
gttgttgtcg gctaaagttt cagtaaggaa agtatatatt gattttgcta tttatacata 3240
tttgacttat gtatagtgta aactaaagca cacatggaaa atgaaaagac taaacaaatt 3300
tatttaaaaa ttacttttac tattatagaa aaaggggaaa aataaaaaac acaaaggcag 3360
agaagaaaat ttagttacaa caggtagcga catttttata ttttcttata taaggaaata 3420
ttcaatgtat tttaaatata aagccaaacc cgatttggtt tggggaagag ctactgaaat 3480
ttttgatatc tatatattca ttactagaag acgaatgaat gtatccaatg tttaaatgtt 3540
gtagcgttta gttttagtgc aatttcacac atgtctacat acatgaatat tcagcgagat 3600
atgtctgcaa actattataa agcaaaagac cactcgaaat cgccatcact ggttggctaa 3660
gactattcca gttatgctgt ttgttgcata aaaaaccaca actacgtaca tcaataaaat 3720
gtataatttt ttattggagt tttagatttg tattaacttc ttccttataa ttacgattat 3780
tattattatt actaatttta tgattattgt gtaacactga cttaaatagc tgaaacaaat 3840
cctgcaacag gatttaaaac acctgaatac acaaaacatt ataacatgaa tacattttgc 3900
ttatggccta gatagtttga tatgtacttt gatatgtatg catgtgtcta tatgtgagtg 3960
cgtccataca aattcctgtc ccaccagaaa aatcacacgc aataaaaaat tccaaaatac 4020
taaactcQta tct acaaacra aaoat t aaaa aacaaat taa fcoaa t aacjaa t atcrt tcrcca 4080
gaagtccaag agatttggct gaaagtatcg acaaattttc aacacatcgt tcatggatat 4140
tgtgctaaca ctctcagttt gaaaatcatt ttctgttaaa ctttctatat aataagttct 4200
ccattcgatt ttgtatttac aatttgtttc tttaattttc ctttatcagt tgtatctatg 4260
aaacatgagg atcacagttc atattgatcg tgttcttctg ccgtacaccg cttctgtccg 4320
ttaatgtaaa ccataagtat aaatgaaatt agttaaatgt ttatttataa ataaagcgct 4380
ataataaatt tcaatacatt tatcatagtt aactgattaa gaccactgaa atcaaaaata 4440
ttttatttac taagcaaagc acacgcaaac aatttataat gtttattacg ttaacaacaa 4500
actcattttt aataattctt tatgaataca caaagttacg caattttccc tctaggcgca 4560
ttgcttaaat agttaaagaa aaataataaa cccatagcgc aatatttaat gtaaaacagt 4620
tttccttgcg tgtgatgttt cctctagcta cgtacaaatt catcatttat taaatttaaa 4680
actcaatttt gcttttaaat aaatttaata agtaaaattc aacaataatt gatatacaat 4740
tgtcaatgca atattttgta ataaaaatgc gaaaaatc 4778



<210> 30 
<211> 808 
<212> PRT 

<213> Drosophila melanogaster 
<400> 30 

Met Pro Asn Met Ser Ser Ile Lys Ala Glu Gln Gln Ser Gly Pro Leu
1               5                   10                  15

Gly Gly Ser Ser Gly Tyr Gln Val Pro Val Asn Met Cys Thr Thr Thr
            20                  25                  30

Val Ala Asn Thr Thr Thr Thr Leu Gly Ser Ser Ala Gly Gly Ala Thr
        35                  40                  45

Gly Ser Arg His Asn Val Ser Val Thr Asn Ile Lys Cys Glu Leu Asp
    50                  55                  60

Glu Leu Pro Ser Pro Asn Gly Asn Met Val Pro Val Ile Ala Asn Tyr
65                  70                  75                  80

Val His Gly Ser Leu Arg Ile Pro Leu Ser Gly His Ser Asn His Arg
                85                  90                  95

Glu Ser Asp Ser Glu Glu Glu Leu Ala Ser Ile Glu Asn Leu Lys Val
            100                 105                 110

Arg Arg Arg Thr Ala Ala Asp Lys Asn Gly Pro Arg Pro Met Ser Trp
        115                 120                 125

Glu Gly Glu Leu Ser Asp Thr Glu Val Asn Gly Gly Glu Glu Leu Met
    130                 135                 140

Glu Met Glu Pro Thr Ile Lys Ser Glu Val Val Pro Ala Val Ala Pro
145                 150                 155                 160

Pro Gln Pro Val Cys Ala Leu Gln Pro Ile Lys Thr Glu Leu Glu Asn
                165                 170                 175

Ile Ala Gly Glu Met Gln Ile Gln Gly Lys Cys Tyr Pro Gln Ser Asn
            180                 185                 190

Thr Gln His His Ala Ala Thr Lys Leu Lys Val Ala Pro Thr Gln Ser
        195                 200                 205

Asp Pro Ile Asn Leu Lys Phe Glu Pro Pro Leu Gly Asp Asn Ser Pro
    210                 215                 220

Leu Leu Ala Ala Arg Ser Lys Ser Ser Ser Gly Gly His Leu Pro Leu
225                 230                 235                 240

Pro Thr Asn Pro Ser Pro Asp Ser Ala Ile His Ser Val Tyr Thr His
                245                 250                 255

Ser Ser Pro Ser Gln Ser Pro Leu Thr Ser Arg His Ala Pro Tyr Thr
            260                 265                 270

Pro Ser Leu Ser Arg Asn Asn Ser Asp Ala Ser His Ser Ser Cys Tyr
        275                 280                 285

Ser Tyr Ser Ser Glu Phe Ser Pro Thr His Ser Pro Ile Gln Ala Arg
    290                 295                 300

His Ala Pro Pro Ala Gly Thr Leu Tyr Gly Asn His His Gly Ile Tyr
305                 310                 315                 320

Arg Gln Met Lys Val Glu Ala Ser Ser Thr Val Pro Ser Ser Gly Gln
                325                 330                 335

Glu Ala Gln Asn Leu Ser Met Asp Ser Ala Ser Ser Asn Leu Asp Thr
            340                 345                 350

Val Gly Leu Gly Ser Ser His Pro Ala Ser Pro Ala Gly Ile Ser Arg
        355                 360                 365

Gln Gln Leu Ile Asn Ser Pro Cys Pro Ile Cys Gly Asp Lys Ile Ser
    370                 375                 380

Gly Phe His Tyr Phe Ile Phe Ser Cys Glu Ser Cys Lys Gly Phe Phe
385                 390                 395                 400

Lys Arg Thr Val Gln Asn Arg Lys Asn Tyr Val Cys Val Arg Gly Gly
                405                 410                 415

Pro Cys Gln Val Ser Ile Ser Thr Arg Lys Lys Cys Pro Ala Cys Arg
            420                 425                 430

Phe Glu Lys Cys Leu Gln Lys Gly Met Lys Leu Glu Ala Ile Arg Glu
        435                 440                 445

Asp Arg Thr Arg Gly Gly Arg Ser Thr Tyr Gln Cys Ser Tyr Thr Leu
    450                 455                 460

Pro Asn Ser Met Leu Ser Pro Leu Leu Ser Pro Asp Gln Ala Ala Ala
465                 470                 475                 480

Ala Ala Ala Ala Ala Ala Val Ala Ser Gln Gln Gln Pro His Gln Arg
                485                 490                 495

Leu His Gln Leu Asn Gly Phe Gly Gly Val Pro Ile Pro Cys Ser Thr
            500                 505                 510

Ser Leu Pro Ala Ser Pro Ser Leu Ala Gly Thr Ser Val Lys Ser Glu
        515                 520                 525

Glu Met Ala Glu Thr Gly Lys Gln Ser Leu Arg Thr Gly Ser Val Pro
    530                 535                 540

Pro Leu Leu Gln Glu Ile Met Asp Val Glu His Leu Trp Gln Tyr Thr
545                 550                 555                 560

Asp Ala Glu Leu Ala Arg Ile Asn Gln Pro Leu Ser Ala Phe Ala Ser
                565                 570                 575

Gly Ser Ser Ser Ser Ser Ser Ser Ser Gly Thr Ser Ser Gly Ala His
            580                 585                 590

Ala Gln Leu Thr Asn Pro Leu Leu Ala Ser Ala Gly Leu Ser Ser Asn
        595                 600                 605

Gly Glu Asn Ala Asn Pro Asp Leu Ile Ala His Leu Cys Asn Val Ala
    610                 615                 620

Asp His Arg Leu Tyr Lys Ile Val Lys Trp Cys Lys Ser Leu Pro Leu
625                 630                 635                 640

Phe Lys Asn Ile Ser Ile Asp Asp Gln Ile Cys Leu Leu Ile Asn Ser
                645                 650                 655

Trp Cys Glu Leu Leu Leu Phe Ser Cys Cys Phe Arg Ser Ile Asp Thr
            660                 665                 670

Pro Gly Glu Ile Lys Met Ser Gln Gly Arg Lys Ile Thr Leu Ser Gln
        675                 680                 685

Ala Lys Ser Asn Gly Leu Gln Thr Cys Ile Glu Arg Met Leu Asn Leu
    690                 695                 700

Thr Asp His Leu Arg Arg Leu Arg Val Asp Arg Tyr Glu Tyr Val Ala
705                 710                 715                 720

Met Lys Val Ile Val Leu Leu Gln Ser Asp Thr Thr Glu Leu Gln Glu
                725                 730                 735

Ala Val Lys Val Arg Glu Cys Gln Glu Lys Ala Leu Gln Ser Leu Gln
            740                 745                 750

Ala Tyr Thr Leu Ala His Tyr Pro Asp Thr Gln Ser Lys Phe Gly Glu
        755                 760                 765

Leu Leu Leu Arg Ile Pro Asp Leu Gln Arg Thr Cys Gln Leu Gly Lys
    770                 775                 780

Glu Met Leu Thr Ile Lys Thr Arg Asp Gly Ala Asp Phe Asn Leu Leu
785                 790                 795                 800

Met Glu Leu Leu Arg Gly Glu His
                805

<210> 31
<211> 2561
<212> DNA
<213> Drosophila melanogaster
<400> 31

ctacgcaaaa taaaacgtac atgaaatgtt attagaaatg gatcagcaac aggcgaccgt 60
acagtttata tcgtcgctga atatatcgcc gttcagcatg cagctggagc agcagcagca 120
gccctccagt cccgctctgg ccgccggtgg caacagcagc aacaacgcgg ccagcggtag 180
caacaacaac agcgccagcg gcaacaacac cagcagcagc agcaacaaca acaacaataa 240
caacgacaat gatgcacacg ttctaacgaa attcgagcac gaatacaatg cctacacgtt 300
gcagttggcc ggaggcggtg ggagtggcag cggcaatcag cagcaccaca gcaaccacag 360
caaccacggc aaccaccacc agcagcagca gcaacaacag caacagcagc agcaacatca 420
gcagcagcag caagaacact accagcagca acagcagcag aatatcgcca acaatgccaa 480
tcaattcaac tcctcgtcct actcgtatat atacaatttc gattcacagt atatattccc 540
tacaggctac caggacacca cctcctcaca ctcgcaacag agcggaggag gcggtggcgg 600
cggcggtggc aacctgctaa acggcagctc cggcggcagc tccgccggcg gtggctacat 660
gctgctcccc caggcggcca gctccagtgg caataatggc aacccgaatg ccggccacat 720
gtcctccggt tccgtgggca atggcagcgg aggcgctggc aatggcggag cgggtggcaa 780
ctccggtccc ggcaatccca tgggcggcac gagcgccacg ccgggacacg gcggcgaggt 840
gatcgacttc aagcacctgt tcgaggagct ttgccccgtg tgtggcgaca aggtgagcgg 900
ctaccactac ggcctgctca cctgcgagtc ctgcaagggc ttcttcaagc gcaccgtgca 960
gaacaagaag gtctacacct gtgtggcgga gcggtcgtgc cacatcgaca agacgcagcg 1020
caagcggtgt ccctactgcc gattccagaa gtgcctcgag gtgggcatga agctagaggc 1080
tgttcgagcg gatagaatgc gtggtggacg caacaaattc ggacccatgt acaaacggga 1140
tcgcgcgcgg aagttgcaag tgatgcggca gcggcagttg gcgctgcaag cgctgcgcaa 1200
ctcgatgggt ccggacatca agccaacgcc gatctcgccg ggctaccagc aagcatatcc 1260
aaatatgaac attaagcagg aaattcaaat acctcaggta tcctcactca cccaatctcc 1320
ggactcgtcg cccagcccca tagcaattgc gttgggacag gtgaacgcga gcacgggcgg 1380
tgttatagcc acgcccatga acgccggcac tggcggcagt gggggcggtg gtctgaacgg 1440
accaagttcc gtgggcaacg gcaatagcag caacggcagc agcaacggca acaacaacag 1500
cagcacgggc aacggaacgt ccggaggagg ctyy uyy cdct u a a t" i^y t~y /~*t r~i r~r gcggaggagg 1560
aggaaccaat tccaacgatg gcctgcatcg caacggcggc aatgacagca gcagttgcca 1620
cgaggctgga ataggatctc tgcagaacac ggccgactcg aaattgtgct tcgattctgg 1680
cacacatcca tcgagcacag ccgacgcgct aatcgagcca ttaagagtct caccgatgat 1740
tcgtgaattt gtgcaatcga ttgacgatcg ggaatggcag acgcaactgt ttgccctgct 1800
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gcagaagcaa acctacaacc aggtggaagt ggatctcttc gagctgctga tgtgcaaagt 1860
gctcgaccag aatttgttct cgcaagtaga ctgggcacgg aacaccgtct tcttcaagga 1920
tctgaaggtc gacgaccaaa tgaagctgct gcagcattcc tggtcggaca tgcttgttct 1980
ggatcacctg catcatcgaa tccataacgg cctgcccgac gagacgcaac tgaacaatgg 2040
tcaggtgttc aatctgatga gtctgggttt gttgggagtg ccacaacccg gcgattactt 2100
caacgagctg cagaacaagc tgcaggacct gaaattcgat atgggcgact atgtctgcat 2160
gaaattccta atcctgttga atccaagtgt acggggtatt gtcaaccgga agaccgtatc 2220
cgagggacat gataatgtgc aagccgcttt gctggactac accctcacct gctatccgtc 2280
agtgaatgac aaattcagag ggctagttaa catcttaccg gaaatccatg ccatggccgt 2340
tcgcggcgag gatcacctga tcacctgtac accaagcact gtgccggcag tgcgcccacc 2400
caaacgctgc tcatggagat gctgcacgcc aagcgcaagg gatagaggtc gggagaacgt 2460
gacacggaat acttaatcat ttatgaaatg taaataacaa ggcgggaagg ccctcggggc 2520
aaccgggaca tggaaggcga acgaaggata cagcagaatt c 2561



<210> 32 

<211> 816 

<212> PRT 

<213> Drosophila melanogaster 

<400> 32 

Met Leu Leu Glu Met Asp Gln Gln Gln Ala Thr Val Gln Phe Ile Ser
1 5 10 15

Ser Leu Asn Ile Ser Pro Phe Ser Met Gln Leu Glu Gln Gln Gln Gln
20 25 30

Pro Ser Ser Pro Ala Leu Ala Ala Gly Gly Asn Ser Ser Asn Asn Ala
35 40 45

Ala Ser Gly Ser Asn Asn Asn Ser Ala Ser Gly Asn Asn Thr Ser Ser
50 55 60

Ser Ser Asn Asn Asn Asn Asn Asn Asn Asp Asn Asp Ala His Val Leu
65 70 75 80

Thr Lys Phe Glu His Glu Tyr Asn Ala Tyr Thr Leu Gln Leu Ala Gly
85 90 95

Gly Gly Gly Ser Gly Ser Gly Asn Gln Gln His His Ser Asn His Ser
100 105 110

Asn His Gly Asn His His Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
115 120 125

Gln Gln His Gln Gln Gln Gln Gln Glu His Tyr Gln Gln Gln Gln Gln
130 135 140
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Gln Asn Ile Ala Asn Asn Ala Asn Gln Phe Asn Ser Ser Ser Tyr Ser
145 150 155 160

Tyr Ile Tyr Asn Phe Asp Ser Gln Tyr Ile Phe Pro Thr Gly Tyr Gln
165 170 175

Asp Thr Thr Ser Ser His Ser Gln Gln Ser Gly Gly Gly Gly Gly Gly
180 185 190

Gly Gly Gly Asn Leu Leu Asn Gly Ser Ser Gly Gly Ser Ser Ala Gly
195 200 205

Gly Gly Tyr Met Leu Leu Pro Gln Ala Ala Ser Ser Ser Gly Asn Asn
210 215 220

Gly Asn Pro Asn Ala Gly His Met Ser Ser Gly Ser Val Gly Asn Gly
225 230 235 240

Ser Gly Gly Ala Gly Asn Gly Gly Ala Gly Gly Asn Ser Gly Pro Gly
245 250 255

Asn Pro Met Gly Gly Thr Ser Ala Thr Pro Gly His Gly Gly Glu Val
260 265 270

Ile Asp Phe Lys His Leu Phe Glu Glu Leu Cys Pro Val Cys Gly Asp
275 280 285

Lys Val Ser Gly Tyr His Tyr Gly Leu Leu Thr Cys Glu Ser Cys Lys
290 295 300

Gly Phe Phe Lys Arg Thr Val Gln Asn Lys Lys Val Tyr Thr Cys Val
305 310 315 320

Ala Glu Arg Ser Cys His Ile Asp Lys Thr Gln Arg Lys Arg Cys Pro
325 330 335

Tyr Cys Arg Phe Gln Lys Cys Leu Glu Val Gly Met Lys Leu Glu Ala
340 345 350

Val Arg Ala Asp Arg Met Arg Gly Gly Arg Asn Lys Phe Gly Pro Met
355 360 365

Tyr Lys Arg Asp Arg Ala Arg Lys Leu Gln Val Met Arg Gln Arg Gln
370 375 380

Leu Ala Leu Gln Ala Leu Arg Asn Ser Met Gly Pro Asp Ile Lys Pro
385 390 395 400

Thr Pro Ile Ser Pro Gly Tyr Gln Gln Ala Tyr Pro Asn Met Asn Ile
405 410 415

Lys Gln Glu Ile Gln Ile Pro Gln Val Ser Ser Leu Thr Gln Ser Pro
420 425 430

Asp Ser Ser Pro Ser Pro Ile Ala Ile Ala Leu Gly Gln Val Asn Ala
435 440 445

Ser Thr Gly Gly Val Ile Ala Thr Pro Met Asn Ala Gly Thr Gly Gly
450 455 460
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Ser Gly Gly Gly Gly Leu Asn Gly Pro Ser Ser Val Gly Asn Gly Asn 
465 470 475 480 

Ser Ser Asn Gly Ser Ser Asn Gly Asn Asn Asn Ser Ser Thr Gly Asn 

485 490 495 

Gly Thr Ser Gly Gly Gly Gly Gly Asn Asn Ala Gly Gly Gly Gly Gly 

500 505 510 

Gly Thr Asn Ser Asn Asp Gly Leu His Arg Asn Gly Gly Asn Asp Ser 
515 520 525 

Ser Ser Cys His Glu Ala Gly Ile Gly Ser Leu Gln Asn Thr Ala Asp
530 535 540

Ser Lys Leu Cys Phe Asp Ser Gly Thr His Pro Ser Ser Thr Ala Asp
545 550 555 560

Ala Leu Ile Glu Pro Leu Arg Val Ser Pro Met Ile Arg Glu Phe Val
565 570 575

Gln Ser Ile Asp Asp Arg Glu Trp Gln Thr Gln Leu Phe Ala Leu Leu
580 585 590

Gln Lys Gln Thr Tyr Asn Gln Val Glu Val Asp Leu Phe Glu Leu Leu
595 600 605

Met Cys Lys Val Leu Asp Gln Asn Leu Phe Ser Gln Val Asp Trp Ala
610 615 620

Arg Asn Thr Val Phe Phe Lys Asp Leu Lys Val Asp Asp Gln Met Lys
625 630 635 640

Leu Leu Gln His Ser Trp Ser Asp Met Leu Val Leu Asp His Leu His
645 650 655

His Arg Ile His Asn Gly Leu Pro Asp Glu Thr Gln Leu Asn Asn Gly
660 665 670

Gln Val Phe Asn Leu Met Ser Leu Gly Leu Leu Gly Val Pro Gln Pro
675 680 685

Gly Asp Tyr Phe Asn Glu Leu Gln Asn Lys Leu Gln Asp Leu Lys Phe
690 695 700

Asp Met Gly Asp Tyr Val Cys Met Lys Phe Leu Ile Leu Leu Asn Pro
705 710 715 720

Ser Val Arg Gly Ile Val Asn Arg Lys Thr Val Ser Glu Gly His Asp
725 730 735

Asn Val Gln Ala Ala Leu Leu Asp Tyr Thr Leu Thr Cys Tyr Pro Ser
740 745 750

Val Asn Asp Lys Phe Arg Gly Leu Val Asn Ile Leu Pro Glu Ile His
755 760 765

Ala Met Ala Val Arg Gly Glu Asp His Leu Ile Thr Cys Thr Pro Ser
770 775 780
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Thr Val Pro Ala Val Arg Pro Pro Lys Arg Cys Ser Trp Arg Cys Cys 
785 790 795 800 

Thr Pro Ser Ala Arg Asp Arg Gly Arg Glu Asn Val Thr Arg Asn Thr 
805 810 815

<210> 33 

<211> 5970 

<212> DNA

<213> Drosophila melanogaster 

<400> 33 





acttactagt gaaaaacatg ataataaaca acttgccaaa aaaaatccaa tgaaattgac 60
acttatgtta aaaaaatagg tgagattgta accgttgatg tacacttacg aagtacgtaa 120
caagttcatg aactgatttc gtgagcaggt ctctccataa tcgccgtatc tgtgggatcg 180
cgcgctcctg ctcgcactcg ctgggtggat ggcagcacat gttcgaagtg cgagagagtg 240
caaagcggag agcgccgacg tcgacgccga aaaaactgaa caagatccgc cgcgaatgtt 300
gattttcctt tcattgacta actgccactc gcagcgcgca gatcgtcggc tccgcttgtt 360
ccgttccgtt cgtttcgttt cgtttcgttc gatctacttc gagtcgcgag ttttaagcag 420
tgtagtgagt gccccgtgaa aaggataacc caaaaagtga tttctactat tttccaatag 480
tttttatcag tgtgaagaaa acatgtaaac ttggctcaaa aagggcttta aaagatacaa 540
agcttcaatg cgaaggataa aataatatcg caccagtgct tcaaaaacca aaactatgcc 600
taaggctgga aatttaaatt aaaatttttt taataaatat tccaaaaata ttgcccctga 660
aaagtgttga taaaccccca accgagcaaa atgttaatgt ccgcggacag ttcagatagc 720
gccaagactt ctgtgatctg cagcacggtg agtgccagca tgctagcacc accagctcca 780
gaacagccca gcaccacagc accacccatt ttgggggtaa caggtcgatc tcacctggaa 840
aatgccctga aactaccgcc aaacacaagt gtttcggctt actaccagca caacagcaag 900








cpi Pi t~ ccaciPi pi 
v»» c* c* v»» ^- ^— y y a. a. 


t - t~ c*Pi era pi arr 


1— y C* CI. ^ 


t~ a 1~ r* Pi c Pi cip\ t~ 


960 

S \J \J 


ctggatactg tgccacccac aggtgtgacc atggcgagtt cttcgaattc tcccaactcc 1020
tccgtcaagc tgccccacag cggcgtgatc tttgtcagca aatcgagtgc cgtcagcacc 1080
accgatggtc ccactgcagt gttgcaacag cagcagccgc agcagcaaat gccccagcac 1140
ttcgagtccc tgccccacca ccacccccag caggaacacc agccacagca gcagcagcaa 1200
caacatcacc ttcagcacca cccacatcca catgtgatgt atccgcacgg atatcagcag 1260
gccaatctgc accactcggg tggtattgct gtggttccgg cggattcgcg tccccagact 1320
cccgagtaca tcaagtccta cccagttatg gatacaactg tggctagttc ggtaaagggg 1380
gaaccagaac tcaacataga attcgatggc accacagtgc tgtgccgcgt ttgcggggat 1440
aaggcctccg gtttccatta cggcgtgcat tcctgcgagg gttgcaaggg attcttccgc 1500
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cgctccatcc agcaaaagat ccagtatcgc ccgtgcacca agaatcagca gtgcagcatt 1560
ctgcgcatca atcgcaatcg ttgtcaatat tgccgcctga aaaagtgcat tgccgtgggc 1620
atgagtcgcg atgctgtgcg ttttggacgc gtgccgaagc gcgaaaaggc gcgtatcctg 1680
gcggccauyc daCayay CdL ^LdyddL. Cyt, ggccagcagc gagccctcgc caccgagctg 1740
gatgaccagc cacgcctcct cgccgccgtg ctgcgcgccc acctcgagac ctgtgagttc 1800
accaaggaga aggtctcggc gatgcggcag cgggcgcggg attgcccctc ctactccatg 1860
cccacacttc tggcctgtcc gctgaacccc gcccctgaac tgcaatcgga gcaggagttc 1920
tcgcagcgtt tcgcccacgt aattcgcggc gtgatcgact ttgccggcat gattcccggc 1980
c t ccagc tyc c cacccagga egauadyce c acgctcctga aggcgggact cttcgacgcc 2040
ctgtttgtgc gcctgatctg catgtttgac tcgtcgataa actcaatcat ctgtctaaat 2100
ggccaggtga tgcgacggga tgcgatccag aacggagcca atgcccgctt cctggtggac 2160
tccaccttca atttcgcgga gcgcatgaac tcgatgaacc tgacagatgc cgagataggc 2220
ctgttctgcg ccatcgttct gattacgccg gatcgccccg gtttgcgcaa cctggagctg 2280
accgagaaga uguactcgcg dcccddyyyc tgcctgcagt acattgtcgc ccagaatagg 2340
cccgatcagc ccgagttcct ggccaagttg ctggagacga tgcccgatct gcgcaccctg 2400
agcaccctgc acaccgagaa actggtagtt ttccgcaccg agcacaagga gctgctgcgc 2460
cagcagatgt ggtccatgga ggacggcaac aacagcgatg gccagcagaa caagtcgccc 2520
tcgggcagct gggcggatgc catggacgtg gaggcggcca agagtccgct tggctcggta 2580
c cgagca c eg ay l CCyccya cc egg aC laC ggcagtccga gcagttcgca gccacagggc 2640
gtgtctctgc cctcgccgcc tcagcaacag ccctcggctc tggccagctc ggctcctctg 2700
ctggcggcca ccctctccgg aggatgtccc ctgcgcaacc gggccaattc cggctccagc 2760
ggtgactccg gagcagctga gatggatatc gttggctcgc acgcacatct cacccagaac 2820
gggctgacaa tcacgccgat tgtgcgacac cagcagcagc aacaacagca gcagcagatc 2880
gga acacu ca aUaaCycyCa LLCCCgCadC ttgaatgggg gacacgcgat gtgccagcaa 2940
cagcagcagc acccacaact gcaccaccac ttgacagccg gagctgcccg ctacagaaag 3000
ctagattcgc ccacggattc gggcattgag tcgggcaacg agaagaacga gtgcaaggcg 3060
gtgagttcgg ggggaagttc ctcgtgctcc agtccgcgtt ccagtgtgga tgatgcgctg 3120
gactgcagcg atgccgccgc caatcacaat caggtggtgc agcatccgca gctgagtgtg 3180
gtgtccgtgt caccagttcg ctcgccccag ccctccacca gcagccatct gaagcgacag 3240
attgtggagg atatgcccgt gctgaagcgc gtgctgcagg ctccccctct gtacgatacc 3300
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aactcgctga tggacgaggc ctacaagccg cacaagaaat tccgggccct gcggcatcgc 3360
gagttcgaga ccgccgaggc ggatgccagc agttccactt ccggctcgaa cagcctgagt 3420
gccggcagtc cgcggcagag cccagtcccg aacagtgtgg ccacgccccc gccatcggcg 3480
gccagcgccg ccgcaggtaa tcccgcccag agccagctgc acatgcacct gacccgcagc 3540
agccccaagg cctcgatggc cagctcgcac tcggtgctgg ccaagtctct catggccgag 3600
ccgcgcatga cgcccgagca gatgaagcgc agcgatatta tccaaaacta cttgaagcgc 3660
gagaacagca cagcagccag cagcaccacc aatggcgtgg gcaaccgcag tcccagcagc 3720
agctccacac cgccgccgtc ggcggtccag aatcagcagc gttggggcag cagctcggtg 3780
atcaccacca cctgccagca gcgccagcag tccgtgtcgc cgcacagcaa cggttccagc 3840
tccagttcga gctctagctc cagctccagt tcgtcatcct cctccacatc ctccaactgc 3900
agctccagct cggccagcag ctgccagtat ttccagtcgc cgcactccac cagcaacggc 3960
accagtgcac cggcgagctc cagttcggga tcgaacagcg ccacgcccct gctggaactg 4020
caggtggaca ttgctgactc ggcgcagcct ctcaatttgt ccaagaaatc gcccacgccg 4080
ccgcccagca agctgcacgc tctggtggcc gccgccaatg ccgttcaaag gtatcccaca 4140
ttgtccgccg acgtcacagt gacagcctcc aatggcgggt cctccgtcgg cggcggcgag 4200
tccggccgcc agcagcagtc cgccggcgag tgtgggctcc cccaatccgg gcctgagcgc 4260
cgccgtgcac aaggtaatgc tggaggcgta agagcgggag gaggtaggtg gttttacgcg 4320
gagaagtggg agagacagag actgggagtg gcagttcagc gaagcaggaa gcaggatcac 4380
ttggagcggc gggagttgaa ttaaattatt ttaccattta attgagacgt gtacaaagtt 4440
tgaaagcaaa accaacatgc atgcaattta aaactaatat ttaaagcaac aacaaacaaa 4500
acaactacaa gttattaatt taaaaaacaa acaaacaaac aaacaacaaa aaacccaagc 4560
ttgaatggta ttacaaaaga aaaagaaaaa cagaaaaaat ataaatatat tttagcagtt 4620
aaactttaac gtagcaagaa accaacaaac ccaaggcagc gctctgattt cgcattaact 4680
tttcttcagc tgctaccgaa aacgcccctc acctcccccc cacccaaccc ttcctccaca 4740
caccaaccgt ctttcgaccc ctgattgttt tataagtttt aagctcttgt tgtacatatt 4800
aattacgttt attggtaact atgtttagcg ctttagttgt agttggagca aaactacttt 4860
gcttttttgg atgttttttg aaaaaactgc aaattattat tattaaattt ttaaatacct 4920
aaaaacaaaa caatQtcrtqt gaaatttttt attgtgcgat ctccaagcag aataaaqtac 4980
agtttgcaac aaattttaac tacgattaag ttgataacga ttcatttttt atgaatttaa 5040
ctaattttat gaatttgtta tagttttcca cccttctata gatctttcta tctgatcatc 5100
tagctacccg tattcctgat ttctcctttg gcacaaagct cttctctatg ctaaagaatc 5160
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15 



aagtggaata aatattgttt tctaatttta aaactaccac aaaaatacga ttaaaatata 5220
cacgaagtaa ttgaaaatca aacaaaatgc ttaaagtttt agcagcaagc agtaaaacga 5280
cgatgaagaa gagaaaccca acgttaaata tatctgttgt gtacatagtt aaatgttaaa 5340
ttaaacacaa aaacatattt aaagtacata taaatacaca taattattaa tgaagaaacc 5400
tatgcttaaa agattcaatg tttgattggc atcttagaaa accaagcgaa aaatacaaaa 5460
aaaaatcaac aaacaaaaat tatgatatat tatttaaaag taaagtatac atttacatta 5520
cagaaaaaca aaagagaaaa cttgcggtag caacaaaact attatattaa ttacatttta 5580
attatgctgt actattatga ttattaatta ttatgattaa ttaattacga tttttatgct 5640
tagacaaacc aacaaaaaac aaatatgcaa aaaccattaa aaaaaaaaac aaaaaacaag 5700
caaaaaatta cactggcgca gaaatttgta ttgtcaaatc aagaaaaaat ttgtttaaaa 5760
atttcaaaag aaagcctctt aagttttttc atttcaattt ccttttcagt tgaacacttt 5820
ttcttaaaat gttcagtttt accgttactt tgctttggag tggtaagatt ttgggtttca 5880
cttgatttca ttttcgtttt gactatcgtg ctgtaaaaat atttactaaa ttatatcgaa 5940
agtatttata tcataattaa ataaacaaga 5970

30 

<210> 34 
<211> 1237 
<212> PRT 

<213> Drosophila melanogaster 
<400> 34

Met Leu Met Ser Ala Asp Ser Ser Asp Ser Ala Lys Thr Ser Val Ile
1 5 10 15

Cys Ser Thr Val Ser Ala Ser Met Leu Ala Pro Pro Ala Pro Glu Gln
20 25 30

Pro Ser Thr Thr Ala Pro Pro Ile Leu Gly Val Thr Gly Arg Ser His
35 40 45

Leu Glu Asn Ala Leu Lys Leu Pro Pro Asn Thr Ser Val Ser Ala Tyr
50 55 60

Tyr Gln His Asn Ser Lys Leu Gly Met Gly Gln Asn Tyr Asn Pro Glu
65 70 75 80

Phe Arg Ser Leu Val Ala Pro Val Thr Asp Leu Asp Thr Val Pro Pro
85 90 95

Thr Gly Val Thr Met Ala Ser Ser Ser Asn Ser Pro Asn Ser Ser Val
100 105 110

Lys Leu Pro His Ser Gly Val Ile Phe Val Ser Lys Ser Ser Ala Val
115 120 125
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Ser Thr Thr Asp Gly Pro Thr Ala Val Leu Gln Gln Gln Gln Pro Gln
130 135 140

Gln Gln Met Pro Gln His Phe Glu Ser Leu Pro His His His Pro Gln
145 150 155 160

Gln Glu His Gln Pro Gln Gln Gln Gln Gln Gln His His Leu Gln His
165 170 175

His Pro His Pro His Val Met Tyr Pro His Gly Tyr Gln Gln Ala Asn
180 185 190

Leu His His Ser Gly Gly Ile Ala Val Val Pro Ala Asp Ser Arg Pro
195 200 205

Gln Thr Pro Glu Tyr Ile Lys Ser Tyr Pro Val Met Asp Thr Thr Val
210 215 220

Ala Ser Ser Val Lys Gly Glu Pro Glu Leu Asn Ile Glu Phe Asp Gly
225 230 235 240

Thr Thr Val Leu Cys Arg Val Cys Gly Asp Lys Ala Ser Gly Phe His
245 250 255

Tyr Gly Val His Ser Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser
260 265 270

Ile Gln Gln Lys Ile Gln Tyr Arg Pro Cys Thr Lys Asn Gln Gln Cys
275 280 285

Ser Ile Leu Arg Ile Asn Arg Asn Arg Cys Gln Tyr Cys Arg Leu Lys
290 295 300

Lys Cys Ile Ala Val Gly Met Ser Arg Asp Ala Val Arg Phe Gly Arg
305 310 315 320

Val Pro Lys Arg Glu Lys Ala Arg Ile Leu Ala Ala Met Gln Gln Ser
325 330 335

Thr Gln Asn Arg Gly Gln Gln Arg Ala Leu Ala Thr Glu Leu Asp Asp
340 345 350

Gln Pro Arg Leu Leu Ala Ala Val Leu Arg Ala His Leu Glu Thr Cys
355 360 365

Glu Phe Thr Lys Glu Lys Val Ser Ala Met Arg Gln Arg Ala Arg Asp
370 375 380

Cys Pro Ser Tyr Ser Met Pro Thr Leu Leu Ala Cys Pro Leu Asn Pro
385 390 395 400

Ala Pro Glu Leu Gln Ser Glu Gln Glu Phe Ser Gln Arg Phe Ala His
405 410 415

Val Ile Arg Gly Val Ile Asp Phe Ala Gly Met Ile Pro Gly Phe Gln
420 425 430

Leu Leu Thr Gln Asp Asp Lys Phe Thr Leu Leu Lys Ala Gly Leu Phe
435 440 445
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Asp Ala Leu Phe Val Arg Leu Ile Cys Met Phe Asp Ser Ser Ile Asn
450 455 460

Ser Ile Ile Cys Leu Asn Gly Gln Val Met Arg Arg Asp Ala Ile Gln
465 470 475 480

Asn Gly Ala Asn Ala Arg Phe Leu Val Asp Ser Thr Phe Asn Phe Ala
485 490 495

Glu Arg Met Asn Ser Met Asn Leu Thr Asp Ala Glu Ile Gly Leu Phe
500 505 510

Cys Ala Ile Val Leu Ile Thr Pro Asp Arg Pro Gly Leu Arg Asn Leu
515 520 525

Glu Leu Ile Glu Lys Met Tyr Ser Arg Leu Lys Gly Cys Leu Gln Tyr
530 535 540

Ile Val Ala Gln Asn Arg Pro Asp Gln Pro Glu Phe Leu Ala Lys Leu
545 550 555 560

Leu Glu Thr Met Pro Asp Leu Arg Thr Leu Ser Thr Leu His Thr Glu
565 570 575

Lys Leu Val Val Phe Arg Thr Glu His Lys Glu Leu Leu Arg Gln Gln
580 585 590

Met Trp Ser Met Glu Asp Gly Asn Asn Ser Asp Gly Gln Gln Asn Lys
595 600 605

Ser Pro Ser Gly Ser Trp Ala Asp Ala Met Asp Val Glu Ala Ala Lys
610 615 620

Ser Pro Leu Gly Ser Val Ser Ser Thr Glu Ser Ala Asp Leu Asp Tyr
625 630 635 640

Gly Ser Pro Ser Ser Ser Gln Pro Gln Gly Val Ser Leu Pro Ser Pro
645 650 655

Pro Gln Gln Gln Pro Ser Ala Leu Ala Ser Ser Ala Pro Leu Leu Ala
660 665 670

Ala Thr Leu Ser Gly Gly Cys Pro Leu Arg Asn Arg Ala Asn Ser Gly
675 680 685

Ser Ser Gly Asp Ser Gly Ala Ala Glu Met Asp Ile Val Gly Ser His
690 695 700

Ala His Leu Thr Gln Asn Gly Leu Thr Ile Thr Pro Ile Val Arg His
705 710 715 720

Gln Gln Gln Gln Gln Gln Gln Gln Gln Ile Gly Ile Leu Asn Asn Ala
725 730 735

His Ser Arg Asn Leu Asn Gly Gly His Ala Met Cys Gln Gln Gln Gln
740 745 750

Gln His Pro Gln Leu His His His Leu Thr Ala Gly Ala Ala Arg Tyr
755 760 765
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Arg Lys Leu Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu
770 775 780

Lys Asn Glu Cys Lys Ala Val Ser Ser Gly Gly Ser Ser Ser Cys Ser
785 790 795 800

Ser Pro Arg Ser Ser Val Asp Asp Ala Leu Asp Cys Ser Asp Ala Ala
805 810 815

Ala Asn His Asn Gln Val Val Gln His Pro Gln Leu Ser Val Val Ser
820 825 830

Val Ser Pro Val Arg Ser Pro Gln Pro Ser Thr Ser Ser His Leu Lys
835 840 845

Arg Gln Ile Val Glu Asp Met Pro Val Leu Lys Arg Val Leu Gln Ala
850 855 860

Pro Pro Leu Tyr Asp Thr Asn Ser Leu Met Asp Glu Ala Tyr Lys Pro
865 870 875 880

His Lys Lys Phe Arg Ala Leu Arg His Arg Glu Phe Glu Thr Ala Glu
885 890 895

Ala Asp Ala Ser Ser Ser Thr Ser Gly Ser Asn Ser Leu Ser Ala Gly
900 905 910

Ser Pro Arg Gln Ser Pro Val Pro Asn Ser Val Ala Thr Pro Pro Pro
915 920 925

Ser Ala Ala Ser Ala Ala Ala Gly Asn Pro Ala Gln Ser Gln Leu His
930 935 940

Met His Leu Thr Arg Ser Ser Pro Lys Ala Ser Met Ala Ser Ser His
945 950 955 960

Ser Val Leu Ala Lys Ser Leu Met Ala Glu Pro Arg Met Thr Pro Glu
965 970 975

Gln Met Lys Arg Ser Asp Ile Ile Gln Asn Tyr Leu Lys Arg Glu Asn
980 985 990

Ser Thr Ala Ala Ser Ser Thr Thr Asn Gly Val Gly Asn Arg Ser Pro
995 1000 1005

Ser Ser Ser Ser Thr Pro Pro Pro Ser Ala Val Gln Asn Gln Gln
1010 1015 1020

Arg Trp Gly Ser Ser Ser Val Ile Thr Thr Thr Cys Gln Gln Arg
1025 1030 1035

Gln Gln Ser Val Ser Pro His Ser Asn Gly Ser Ser Ser Ser Ser
1040 1045 1050

Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Thr Ser Ser
1055 1060 1065

Asn Cys Ser Ser Ser Ser Ala Ser Ser Cys Gln Tyr Phe Gln Ser
1070 1075 1080
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Pro His Ser Thr Ser Asn Gly Thr Ser Ala Pro Ala Ser Ser Ser
1085 1090 1095

Ser Gly Ser Asn Ser Ala Thr Pro Leu Leu Glu Leu Gln Val Asp
1100 1105 1110

Ile Ala Asp Ser Ala Gln Pro Leu Asn Leu Ser Lys Lys Ser Pro
1115 1120 1125

Thr Pro Pro Pro Ser Lys Leu His Ala Leu Val Ala Ala Ala Asn
1130 1135 1140

Ala Val Gln Arg Tyr Pro Thr Leu Ser Ala Asp Val Thr Val Thr
1145 1150 1155

Ala Ser Asn Gly Gly Ser Ser Val Gly Gly Gly Glu Ser Gly Arg
1160 1165 1170

Gln Gln Gln Ser Ala Gly Glu Cys Gly Leu Pro Gln Ser Gly Pro
1175 1180 1185

Glu Arg Arg Arg Ala Gln Gly Asn Ala Gly Gly Val Arg Ala Gly
1190 1195 1200

Gly Gly Arg Trp Phe Tyr Ala Glu Lys Trp Glu Arg Gln Arg Leu
1205 1210 1215

Gly Val Ala Val Gln Arg Ser Arg Lys Gln Asp His Leu Glu Arg
1220 1225 1230

Arg Glu Leu Asn
1235
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39 


aatgtctgca ctgcgatttg tcgctgt 


<210> 


40 
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40 
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<400> 41 

attgtaagtc tggctctgca agagccc 



<210> 42 
<211> 28 
<212> DNA 

<213> Heliothis virescens 
<400> 42 

cgtgagagac gacacttgag ggatctgt 
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27 



28 



<210> 43 

<211> 462 

<212> PRT 

<213> Bombyx mori 

<400> 43 

Met Ser Ser Val Ala Lys Lys Asp Lys Arg Thr Met Ser Val Thr Ala
1 5 10 15

Leu Ile Asn Arg Ala Trp Pro Met Thr Pro Ser Pro Gln Gln Gln Gln
20 25 30

Gln Met Val Pro Ser Thr Gln His Ser Asn Phe Leu His Ala Met Ala
35 40 45

Thr Pro Ser Thr Thr Pro Asn Val Glu Leu Asp Ile Gln Trp Leu Asn
50 55 60

Ile Glu Ser Gly Phe Met Ser Pro Met Ser Pro Pro Glu Met Lys Pro
65 70 75 80

Asp Thr Ala Met Leu Asp Gly Phe Arg Asp Asp Ser Thr Pro Pro Pro
85 90 95

Pro Phe Lys Asn Tyr Pro Pro Asn His Pro Leu Ser Gly Ser Lys His
100 105 110

Leu Cys Ser Ile Cys Gly Asp Arg Ala Ser Gly Lys His Tyr Gly Val
115 120 125

Tyr Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys Arg Thr Val Arg Lys
130 135 140

Asp Leu Thr Tyr Ala Cys Arg Glu Asp Lys Asn Cys Ile Ile Asp Lys
145 150 155 160

Arg Gln Arg Asn Arg Cys Gln Tyr Cys Arg Tyr Gln Lys Cys Leu Ala
165 170 175

Cys Gly Met Lys Arg Glu Ala Val Gln Glu Glu Arg Gln Arg Ala Ala
180 185 190

Arg Arg Thr Glu Asp Ala His Pro Ser Ser Ser Val Gln Glu Leu Ser
195 200 205

Ile Glu Arg Leu Leu Glu Leu Glu Ala Leu Val Ala Asp Ser Ala Glu
210 215 220
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Glu Leu Gln Ile Leu Arg Val Gly Pro Glu Ser Gly Val Pro Ala Lys
225 230 235 240

Tyr Arg Ala Pro Val Ser Ser Leu Cys Gln Ile Gly Asn Lys Gln Ile
245 250 255

Ala Ala Leu Ile Val Trp Ala Arg Asp Ile Pro His Phe Gly Gln Leu
260 265 270



Glu Ile Asp Asp Gln Ile Leu Leu Ile Lys Gly Ser Trp Asn Glu Leu
275 280 285

Leu Leu Phe Ala Ile Ala Trp Arg Ser Met Glu Phe Leu Asn Asp Glu
290 295 300

Arg Glu Asn Val Asp Ser Arg Asn Thr Ala Pro Pro Gln Leu Ile Cys
305 310 315 320

Leu Met Pro Gly Met Thr Leu His Arg Asn Ser Ala Leu Gln Ala Gly
325 330 335

Val Gly Gln Ile Phe Asp Arg Val Leu Ser Glu Leu Ser Leu Lys Met
340 345 350

Arg Ser Leu Arg Met Asp Gln Ala Glu Cys Val Ala Leu Lys Ala Ile
355 360 365

Ile Leu Leu Asn Pro Asp Val Lys Gly Leu Lys Asn Lys Gln Glu Val
370 375 380

Asp Val Leu Arg Glu Lys Met Phe Leu Cys Leu Asp Glu Tyr Cys Arg
385 390 395 400

Arg Ser Arg Gly Gly Glu Glu Gly Arg Phe Ala Ala Leu Leu Leu Arg
405 410 415

Leu Pro Ala Leu Arg Ser Ile Ser Leu Lys Ser Phe Glu His Leu Tyr
420 425 430

Leu Phe His Leu Val Ala Glu Gly Ser Val Ser Ser Tyr Ile Arg Asp
435 440 445

Ala Leu Cys Asn His Ala Pro Pro Ile Asp Thr Asn Ile Met
450 455 460



<210> 44 
<211> 461 
<212> PRT 

<213> Manduca sexta 
<400> 44 

Met Ser Ser Val Ala Lys Lys Asp Lys Arg Thr Met Ser Val Thr Ala
1 5 10 15

Leu Ile Asn Arg Ala Trp Pro Leu Thr Pro Ala Pro His Gln Gln Gln
20 25 30

Ser Met Pro Ser Ser Gln Pro Ser Asn Phe Leu Gln Pro Leu Ala Thr
35 40 45
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Pro Ser Thr Thr Pro Ser Val Glu Leu Asp Ile Gln Trp Leu Asn Ile
50 55 60

Glu Pro Gly Phe Met Ser Pro Met Ser Pro Pro Glu Met Lys Pro Asp
65 70 75 80

Thr Ala Met Leu Asp Gly Leu Arg Asp Asp Ser Thr Pro Pro Pro Ala
85 90 95

Phe Lys Asn Tyr Pro Pro Asn His Pro Leu Ser Gly Ser Lys His Leu
100 105 110

Cys Ser Ile Cys Gly Asp Arg Ala Ser Gly Lys His Tyr Gly Val Tyr
115 120 125

Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys Arg Thr Val Arg Lys Asp
130 135 140

Leu Thr Tyr Ala Cys Arg Glu Asp Arg Asn Cys Ile Ile Asp Lys Arg
145 150 155 160

Gln Arg Asn Arg Cys Gln Tyr Cys Arg Tyr Gln Lys Cys Leu Ala Cys
165 170 175

Gly Met Lys Arg Glu Ala Val Gln Glu Glu Arg Gln Arg Ala Ala Arg
180 185 190

Gly Thr Glu Asp Ala His Pro Ser Ser Ser Val Gln Glu Leu Ser Ile
195 200 205

Glu Arg Leu Leu Glu Ile Glu Ser Leu Val Ala Asp Pro Pro Glu Glu
210 215 220

Phe Gln Phe Leu Arg Val Gly Pro Glu Ser Gly Val Pro Ala Lys Tyr
225 230 235 240

Arg Ala Pro Val Ser Ser Leu Cys Gln Ile Gly Asn Lys Gln Ile Ala
245 250 255

Ala Leu Val Val Trp Ala Arg Asp Ile Pro His Phe Gly Gln Leu Glu
260 265 270

Leu Glu Asp Gln Ile Leu Leu Ile Lys Asn Ser Trp Asn Glu Leu Leu
275 280 285

Leu Phe Ala Ile Ala Trp Arg Ser Met Glu Tyr Leu Thr Asp Glu Arg
290 295 300

Glu Asn Val Asp Ser Arg Ser Thr Ala Pro Pro Gln Leu Met Cys Leu
305 310 315 320

Met Pro Gly Met Thr Leu His Arg Asn Ser Ala Leu Gln Ala Gly Val
325 330 335

Gly Gln Ile Phe Asp Arg Val Leu Ser Glu Leu Ser Leu Lys Met Arg
340 345 350

Thr Leu Arg Met Asp Gln Ala Glu Tyr Val Ala Leu Lys Ala Ile Ile
355 360 365
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Leu Leu Asn Pro Asp Val Lys Gly Leu Lys Asn Lys Pro Glu Val Val
370 375 380

Val Leu Arg Glu Lys Met Phe Ser Cys Leu Asp Glu Tyr Val Arg Arg
385 390 395 400

Ser Arg Cys Ala Glu Glu Gly Arg Phe Ala Ala Leu Leu Leu Arg Leu
405 410 415

Pro Ala Leu Arg Ser Ile Ser Leu Lys Cys Phe Glu His Leu Tyr Phe
420 425 430

Phe His Leu Val Ala Asp Thr Ser Ile Ala Ser Tyr Ile His Asp Ala
435 440 445

Leu Arg Asn His Ala Pro Ser Ile Asp Thr Ser Ile Leu
450 455 460



<210> 45 
<211> 472 
<212> PRT 

<213> Choristoneura fumiferana 
<400> 45 

Met Ser Ser Val Ala Lys Lys Asp Lys Pro Thr Met Ser Val Thr Ala
1 5 10 15

Leu Ile Asn Trp Ala Arg Pro Ala Pro Pro Gly Pro Pro Gln Pro Gln
20 25 30

Ser Ala Ser Pro Ala Pro Ala Ala Met Leu Gln Gln Leu Pro Thr Gln
35 40 45

Ser Met Gln Ser Leu Asn His Ile Pro Thr Val Asp Cys Ser Leu Asp
50 55 60

Met Gln Trp Leu Asn Leu Glu Pro Gly Phe Met Ser Pro Met Ser Pro
65 70 75 80

Pro Glu Met Lys Pro Asp Thr Ala Met Leu Asp Gly Leu Arg Asp Asp
85 90 95

Ala Thr Ser Pro Pro Asn Phe Lys Asn Tyr Pro Pro Asn His Pro Leu
100 105 110

Ser Gly Ser Lys His Leu Cys Ser Ile Cys Gly Asp Arg Ala Ser Gly
115 120 125

Lys His Tyr Gly Val Tyr Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys
130 135 140

Arg Thr Val Arg Lys Asp Leu Ser Tyr Ala Cys Arg Glu Glu Arg Asn
145 150 155 160

Cys Ile Ile Asp Lys Arg Gln Arg Asn Arg Cys Gln Tyr Cys Arg Tyr
165 170 175

Gln Lys Cys Leu Ala Cys Gly Met Lys Arg Glu Ala Val Gln Glu Glu
180 185 190
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Arg Gln Arg Asn Ala Arg Gly Ala Glu Asp Ala His Pro Ser Ser Ser
195 200 205

Val Gln Val Ser Asp Glu Leu Ser Ile Glu Arg Leu Thr Glu Met Glu
210 215 220

Ser Leu Val Ala Asp Pro Ser Glu Glu Phe Gln Phe Leu Arg Val Gly
225 230 235 240

Pro Asp Ser Asn Val Pro Pro Arg Tyr Arg Ala Pro Val Ser Ser Leu
245 250 255

Cys Gln Ile Gly Asn Lys Gln Ile Ala Ala Leu Val Val Trp Ala Arg
260 265 270

Asp Ile Pro His Phe Gly Gln Leu Glu Leu Asp Asp Gln Val Val Leu
275 280 285

Ile Lys Ala Ser Trp Asn Glu Leu Leu Leu Phe Ala Ile Ala Trp Arg
290 295 300

Ser Met Glu Tyr Leu Glu Asp Glu Arg Glu Asn Gly Asp Gly Thr Arg
305 310 315 320

Ser Thr Thr Gln Pro Gln Leu Met Cys Leu Met Pro Gly Met Thr Leu
325 330 335

His Arg Asn Ser Ala Gln Gln Ala Gly Val Gly Ala Ile Phe Asp Arg
340 345 350

Val Leu Ser Glu Leu Ser Leu Lys Met Arg Thr Leu Arg Met Asp Gln
355 360 365

Ala Glu Tyr Val Ala Leu Lys Ala Ile Val Leu Leu Asn Pro Asp Val
370 375 380

Lys Gly Leu Lys Asn Arg Gln Glu Val Asp Val Leu Arg Glu Lys Met
385 390 395 400

Phe Ser Cys Leu Asp Asp Tyr Cys Arg Arg Ser Arg Ser Asn Glu Glu
405 410 415

Gly Arg Phe Ala Ser Leu Leu Leu Arg Leu Pro Ala Leu Arg Ser Ile
420 425 430

Ser Leu Lys Ser Phe Glu His Leu Tyr Phe Phe His Leu Val Ala Glu
435 440 445

Gly Ser Ile Ser Gly Tyr Ile Arg Glu Ala Leu Arg Asn His Ala Pro
450 455 460

Pro Ile Asp Val Asn Ala Met Met
465 470



<210> 46 

<211> 389 

<212> PRT 

<213> Locusta migratoria 

<400> 46

Met Glu Gly Ser Glu Arg Gly Ile Ser Leu Glu Asn Asn Leu Ser Ile
1 5 10 15

Ser Ser Met Gly Pro Gln Ser Pro Leu Asp Met Lys Pro Asp Thr Ala
20 25 30

Ser Leu Ile Ser Ser Gly Ser Phe Ser Pro Thr Gly Gly Pro Asn Ser
35 40 45

Pro Gly Ser Phe Thr Ile Gly His Ser Ser Leu Leu Asn Asn Ser Ser
50 55 60

Ser Asn Gln Ala Lys Gly Ser Ser Ser Gln Tyr Pro Pro Asn His Pro
65 70 75 80

Leu Ser Gly Ser Lys His Leu Cys Ser Ile Cys Gly Asp Arg Ala Ser
85 90 95

Gly Lys His Tyr Gly Val Tyr Ser Cys Glu Gly Cys Lys Gly Phe Phe
100 105 110

Lys Arg Thr Val Arg Lys Asp Leu Ser Tyr Ala Cys Arg Glu Asp Lys
115 120 125

Asn Cys Ile Ile Asp Lys Arg Gln Arg Asn Arg Cys Gln Tyr Cys Arg
130 135 140

Tyr Gln Lys Cys Leu Ala Met Gly Met Lys Arg Glu Ala Val Gln Glu
145 150 155 160

Glu Arg Gln Arg Thr Lys Glu Arg Asp Gln Asn Glu Val Glu Ser Thr
165 170 175

Ser Ser Leu His Thr Asp Met Pro Val Glu Arg Ile Leu Glu Ala Glu
180 185 190

Lys Arg Val Glu Cys Lys Ala Glu Asn Gln Val Glu Tyr Glu Leu Val
195 200 205

Glu Trp Ala Lys His Ile Pro His Phe Thr Ser Leu Pro Leu Glu Asp
210 215 220

Gln Val Leu Leu Leu Arg Ala Gly Trp Asn Glu Leu Leu Ile Ala Ala
225 230 235 240

Phe Ser His Arg Ser Val Asp Val Lys Asp Gly Ile Val Leu Ala Thr
245 250 255

Gly Leu Thr Val His Arg Asn Ser Ala His Gln Ala Gly Val Gly Thr
260 265 270

Ile Phe Asp Arg Val Leu Thr Glu Leu Val Ala Lys Met Arg Glu Met
275 280 285

Lys Met Asp Lys Thr Glu Leu Gly Cys Leu Arg Ser Val Ile Leu Phe
290 295 300

Asn Pro Glu Val Arg Gly Leu Lys Ser Ala Gln Glu Val Glu Leu Leu
305 310 315 320

Arg Glu Lys Val Tyr Ala Ala Leu Glu Glu Tyr Thr Arg Thr Thr His
325 330 335

Pro Asp Glu Pro Gly Arg Phe Ala Lys Leu Leu Leu Arg Leu Pro Ser
340 345 350

Leu Arg Ser Ile Gly Leu Lys Cys Leu Glu His Leu Phe Phe Phe Arg
355 360 365

Leu Ile Gly Asp Val Pro Ile Asp Thr Phe Leu Met Glu Met Leu Glu
370 375 380

Ser Pro Ser Asp Ser
385



<210> 47 
<211> 552 
<212> PRT 

<213> Chironomus tentans 
<400> 47 

Met Leu Lys Lys Glu Lys Pro Met Met Thr Val Ala Ala Ile Ile Glu
1 5 10 15

Gln Ala Gln Asn Arg Trp Met Asp His Pro Leu Val Tyr Asn Ser Arg
20 25 30

Ser Leu Gln Phe Gln Gly Ser Tyr Cys Ile Asp Ser Ser Leu Leu Gly
35 40 45

His Met Gly Pro Leu Ser Pro Pro Asp Leu Lys Pro Asp Ile Ser Leu
50 55 60

Leu Asn Cys Asn Asn Asn Asn Asn Asn Thr Asn Asn Asn Asn Ser Asn
65 70 75 80

Ser Ser His Asn Asn Leu Asn His His Asn Thr Ser Pro Leu Pro Val
85 90 95

Leu Gly Ala Asn Thr Phe Ser Pro Ile Gln Ser Leu Asn Asn Asn Gly
100 105 110

Pro Ser Ser Pro Leu Ser Ser Ile Gly Asn Gly Ser Gly Thr Ile Val
115 120 125

Thr Phe Asn Gln Ile Lys Leu Gln Ser Pro Ser Pro Ser Asn Ala Ser
130 135 140

Ser Ser Ser Thr Leu Ser Gly Pro Leu Thr Thr Thr Pro Pro Ala Thr
145 150 155 160

Asn Ala Asn Asn Ile Leu Gly Met Gly Asn Gly Asn Cys Gly Asn Thr
165 170 175

Ala Asn Gly Lys Gln Ser Gln Tyr Pro Pro Asn His Pro Leu Ser Gly
180 185 190

Ser Lys His Leu Cys Ser Ile Cys Gly Asp Arg Ala Ser Gly Lys His
195 200 205

Tyr Gly Val Tyr Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys Arg Thr
210 215 220

Val Arg Lys Asp Leu Ser Tyr Ala Cys Arg Glu Glu Arg Asn Cys Val
225 230 235 240

Ile Asp Lys Lys Gln Arg Asn Arg Cys Gln Tyr Cys Arg Tyr Gln Lys
245 250 255

Cys Leu Asn Cys Gly Met Lys Arg Glu Ala Val Gln Glu Glu Arg Gln
260 265 270

Arg Gly Gly Lys Ser Gln Lys Gly Asp Asp Met Ser Ile Ser Ser Thr
275 280 285

Gln Ser Leu Val Asn Asn Gly Pro Gly Arg Asp Ile Thr Val Glu Arg
290 295 300

Leu Met Glu Ala Asp Gln Met Ser Glu Ala Arg Cys Gly Asp Lys Ser
305 310 315 320

Ile Gln Tyr Leu Arg Val Ala Ala Ser Asn Thr Met Ile Pro Pro Glu
325 330 335

Tyr Arg Ala Pro Val Ser Ala Ile Cys Ala Met Val Asn Lys Gln Val
340 345 350

Phe Gln His Met Asp Phe Cys Arg Arg Leu Pro His Phe Thr Lys Leu
355 360 365

Pro Leu Asn Asp Gln Met Tyr Leu Leu Lys Gln Ser Leu Asn Glu Leu
370 375 380

Leu Ile Leu Asn Ile Ala Tyr Met Ser Ile Gln Tyr Val Glu Pro Asp
385 390 395 400

Arg Arg Asn Ala Asp Gly Ser Leu Glu Arg Arg Gln Ile Ser Gln Gln
405 410 415

Met Cys Leu Ser Arg Asn Tyr Thr Leu Gly Arg Asn Met Ala Val Gln
420 425 430

Ala Gly Val Val Gln Ile Phe Asp Arg Ile Leu Ser Glu Leu Ser Val
435 440 445

Lys Met Lys Arg Leu Asp Leu Asp Ala Thr Glu Leu Cys Leu Leu Lys
450 455 460

Ser Ile Val Val Phe Asn Pro Asp Val Arg Thr Leu Asp Asp Arg Lys
465 470 475 480

Ser Ile Asp Leu Leu Arg Ser Arg Ile Tyr Ala Ser Leu Asp Glu Tyr
485 490 495

Cys Arg Gln Lys His Pro Asn Glu Asp Gly Arg Phe Ala Gln Leu Leu
500 505 510

Leu Arg Leu Pro Ala Leu Arg Ser Ile Ser Leu Lys Cys Leu Asp His
515 520 525

Leu Phe Tyr Phe Gln Leu Ile Asp Asp Lys Asn Val Glu Asn Ser Val
530 535 540

Ile Glu Glu Phe His Lys Leu Asn
545 550

<210> 48
<211> 508
<212> PRT
<213> Drosophila melanogaster
<400> 48

Met Asp Asn Cys Asp Gln Asp Ala Ser Phe Arg Leu Ser His Ile Lys
1 5 10 15

Glu Glu Val Lys Pro Asp Ile Ser Gln Leu Asn Asp Ser Asn Asn Ser
20 25 30

Ser Phe Ser Pro Lys Ala Glu Ser Pro Val Pro Phe Met Gln Ala Met
35 40 45

Ser Met Val His Val Leu Pro Gly Ser Asn Ser Ala Ser Ser Asn Asn
50 55 60

Asn Ser Ala Gly Asp Ala Gln Met Ala Gln Ala Pro Asn Ser Ala Gly
65 70 75 80

Gly Ser Ala Ala Ala Ala Val Gln Gln Gln Tyr Pro Pro Asn His Pro
85 90 95

Leu Ser Gly Ser Lys His Leu Cys Ser Ile Cys Gly Asp Arg Ala Ser
100 105 110

Gly Lys His Tyr Gly Val Tyr Ser Cys Glu Gly Cys Lys Gly Phe Phe
115 120 125

Lys Arg Thr Val Arg Lys Asp Leu Thr Tyr Ala Cys Arg Glu Asn Arg
130 135 140

Asn Cys Ile Ile Asp Lys Arg Gln Arg Asn Arg Cys Gln Tyr Cys Arg
145 150 155 160

Tyr Gln Lys Cys Leu Thr Cys Gly Met Lys Arg Glu Ala Val Gln Glu
165 170 175

Glu Arg Gln Arg Gly Ala Arg Asn Ala Ala Gly Arg Leu Ser Ala Ser
180 185 190

Gly Gly Gly Ser Ser Gly Pro Gly Ser Val Gly Gly Ser Ser Ser Gln
195 200 205

Gly Gly Gly Gly Gly Gly Gly Val Ser Gly Gly Met Gly Ser Gly Asn
210 215 220

Gly Ser Asp Asp Phe Met Thr Asn Ser Val Ser Arg Asp Phe Ser Ile
225 230 235 240

Glu Arg Ile Ile Glu Ala Glu Gln Arg Ala Glu Thr Gln Cys Gly Asp
245 250 255

Arg Ala Leu Thr Phe Leu Arg Val Gly Pro Tyr Ser Thr Val Gln Pro
260 265 270

Asp Tyr Lys Gly Ala Val Ser Ala Leu Cys Gln Val Val Asn Lys Gln
275 280 285

Leu Phe Gln Met Val Glu Tyr Ala Arg Met Met Pro His Phe Ala Gln
290 295 300

Val Pro Leu Asp Asp Gln Val Ile Leu Leu Lys Ala Ala Trp Ile Glu
305 310 315 320

Leu Leu Ile Ala Asn Val Ala Trp Cys Ser Ile Val Ser Leu Asp Asp
325 330 335

Gly Gly Ala Gly Gly Gly Gly Gly Gly Leu Gly His Asp Gly Ser Phe
340 345 350

Glu Arg Arg Ser Pro Gly Leu Gln Pro Gln Gln Leu Phe Leu Asn Gln
355 360 365

Ser Phe Ser Tyr His Arg Asn Ser Ala Ile Lys Ala Gly Val Ser Ala
370 375 380

Ile Phe Asp Arg Ile Leu Ser Glu Leu Ser Val Lys Met Lys Arg Leu
385 390 395 400

Asn Leu Asp Arg Arg Glu Leu Ser Cys Leu Lys Ala Ile Ile Leu Tyr
405 410 415

Asn Pro Asp Ile Arg Gly Ile Lys Ser Arg Ala Glu Ile Glu Met Cys
420 425 430

Arg Glu Lys Val Tyr Ala Cys Leu Asp Glu His Cys Arg Leu Glu His
435 440 445

Pro Gly Asp Asp Gly Arg Phe Ala Gln Leu Leu Leu Arg Leu Pro Ala
450 455 460

Leu Arg Ser Ile Ser Leu Lys Cys Gln Asp His Leu Phe Leu Phe Arg
465 470 475 480

Ile Thr Ser Asp Arg Pro Leu Glu Glu Leu Phe Leu Glu Gln Leu Glu
485 490 495

Ala Pro Pro Pro Pro Gly Leu Ala Met Lys Leu Glu
500 505



<210> 49
<211> 555
<212> PRT
<213> Bombyx mori
<400> 49

Met His Glu Asp Ala Pro Lys Met Ser Ile Ala Gln Ser Leu Ala Ala
1 5 10 15

Ser Thr Ser Gln Pro Lys Gly Asp Ile Val Thr Glu Ile Pro Leu Glu
20 25 30






Phe Ala Met Ser Ser Met Glu Thr Lys Ser Ile Glu Thr Thr Asn Val
35 40 45

Glu Leu Lys Ile Thr Tyr Val Asp Pro Thr Thr Gly Thr Gly Gly Glu
50 55 60

Pro Gly Ala Tyr Leu Pro Thr Ala Gly Thr Val Cys Asp Gln Thr Asp
65 70 75 80

Thr Lys Asp Val Ile Glu Glu Leu Cys Pro Val Cys Gly Asp Lys Val
85 90 95

Ser Gly Tyr His Tyr Gly Leu Leu Thr Cys Glu Ser Cys Lys Gly Phe
100 105 110

Phe Lys Arg Thr Val Gln Asn Lys Lys Val Tyr Thr Cys Val Ala Glu
115 120 125

Arg Ala Cys His Ile Asp Lys Thr Gln Arg Lys Arg Cys Pro Phe Cys
130 135 140

Arg Phe Gln Lys Cys Leu Asp Val Gly Met Lys Leu Glu Ala Val Arg
145 150 155 160

Ala Asp Arg Met Arg Gly Gly Arg Asn Lys Phe Gly Pro Met Tyr Lys
165 170 175

Arg Asp Arg Ala Arg Lys Leu Gln Met Met Arg Gln Arg Gln Ile Ala
180 185 190

Val Gln Thr Leu Arg Gly Ser Leu Gly Asp Gly Gly Leu Val Leu Gly
195 200 205

Phe Gly Ser Pro Tyr Thr Ala Val Ser Val Lys Gln Glu Ile Gln Ile
210 215 220

Pro Gln Val Ser Ser Leu Thr Ser Ser Pro Glu Ser Ser Pro Gly Pro
225 230 235 240

Ala Leu Leu Arg Ala Gln Pro Gln Pro Pro Gln Pro Pro Pro Pro Pro
245 250 255

Thr His Asp Lys Trp Glu Ala His Ser Pro His Ser Ala Ser Pro Asp
260 265 270

Ala Phe Thr Phe Asp Thr Gln Ser Asn Thr Ala Ala Thr Pro Ser Ser
275 280 285

Thr Ala Glu Ala Thr Ser Thr Glu Thr Leu Arg Val Ser Pro Met Ile
290 295 300

Arg Glu Phe Val Gln Thr Val Asp Asp Arg Glu Trp Gln Asn Ala Leu
305 310 315 320

Phe Gly Leu Leu Gln Ser Gln Thr Tyr Asn Gln Cys Glu Val Asp Leu
325 330 335

Phe Glu Leu Met Cys Lys Val Leu Asp Gln Asn Leu Phe Ser Gln Val
340 345 350

Asp Trp Ala Arg Asn Thr Val Phe Phe Lys Tyr Leu Lys Val Asp Asp
355 360 365

Gln Met Lys Leu Leu Gln Asp Ser Trp Ser Val Met Leu Val Leu Asp
370 375 380

His Leu His Gln Arg Met His Asn Gly Leu Pro Asp Glu Thr Thr Leu
385 390 395 400

His Asn Gly Gln Lys Phe Asp Leu Leu Cys Leu Gly Leu Leu Gly Val
405 410 415

Pro Ser Leu Ala Asp His Phe Asn Glu Leu Gln Asn Lys Leu Ala Glu
420 425 430

Leu Lys Phe Asp Val Pro Asp Tyr Ile Cys Val Lys Phe Met Leu Leu
435 440 445

Leu Asn Pro Glu Val Arg Gly Ile Val Asn Val Lys Cys Val Arg Glu
450 455 460

Gly Tyr Gln Thr Val Gln Ala Ala Leu Leu Asp Tyr Pro Tyr Leu Leu
465 470 475 480

Ser Thr Ile Gln Asp Lys Phe Gly Lys Leu Val Met Val Val Pro Glu
485 490 495

Ile His Ala Leu Arg Leu Gly Glu Lys Ser Thr Cys Thr Ser Gly Ile
500 505 510

Val Gln Ala Arg His Leu Pro Arg Leu Phe Ser Trp Lys Cys Cys Thr
515 520 525

Gln Asn Ala Asn Leu Glu Val Pro Val Thr Asn Lys Val Glu Glu Leu
530 535 540

Arg Ser Ala Lys Pro Arg Arg His His Asn Lys
545 550 555



<210> 50 
<211> 699 
<212> PRT 

<213> Manduca sexta 
<400> 50 

Met Thr Leu Val Met Ser Pro Asp Ser Ser Tyr Gly Arg Tyr Asp Ala
1 5 10 15

Pro Thr Pro Thr Asp Val Thr Ser Pro Val His Arg Glu Arg Glu Pro
20 25 30

Glu Leu His Ile Glu Phe Asp Gly Thr Thr Val Leu Cys Arg Val Cys
35 40 45

Gly Asp Lys Ala Ser Gly Phe His Tyr Gly Val His Ser Cys Glu Gly
50 55 60

Cys Lys Gly Phe Phe Arg Arg Ser Ile Gln Gln Lys Ile Gln Tyr Arg
65 70 75 80

Pro Cys Thr Lys Asn Gln Gln Cys Ser Ile Leu Arg Ile Asn Arg Asn
85 90 95

Arg Cys Gln Tyr Cys Arg Leu Lys Lys Cys Ile Ala Val Gly Met Ser
100 105 110

Arg Asp Ala Val Arg Phe Gly Arg Val Pro Lys Arg Glu Lys Ala Arg
115 120 125

Ile Leu Ala Ala Met Gln Gln Ser Ser Thr Ser Arg Ala His Glu Gln
130 135 140

Ala Ala Ala Ala Glu Leu Asp Asp Ala Pro Arg Leu Leu Ala Arg Val
145 150 155 160

Val Arg Ala His Leu Asp Thr Cys Glu Phe Thr Arg Asp Arg Val Ala
165 170 175

Ala Met Arg Ala Arg Ala Arg Asp Cys Pro Thr Tyr Ser Gln Pro Thr
180 185 190

Leu Ala Cys Pro Leu Asn Pro Ala Pro Glu Leu Gln Ser Glu Lys Glu
195 200 205

Phe Ser Gln Arg Phe Ala His Val Ile Arg Gly Val Ile Asp Phe Ala
210 215 220

Gly Leu Ile Pro Gly Phe Gln Leu Leu Thr Gln Asp Asp Lys Phe Thr
225 230 235 240

Leu Leu Lys Ser Gly Leu Phe Asp Ala Leu Phe Val Arg Leu Ile Cys
245 250 255

Met Phe Asp Ala Pro Leu Asn Ser Ile Ile Cys Leu Asn Gly Gln Leu
260 265 270

Met Lys Arg Asp Ser Ile Gln Ser Gly Ala Asn Ala Arg Phe Leu Val
275 280 285

Asp Ser Thr Phe Lys Phe Ala Glu Arg Met Asn Ser Met Asn Leu Thr
290 295 300

Asp Ala Glu Ile Gly Leu Phe Cys Ala Ile Val Leu Ile Thr Pro Asp
305 310 315 320

Arg Pro Gly Leu Arg Asn Val Glu Leu Val Glu Arg Met His Thr Arg
325 330 335

Leu Lys Ala Cys Leu Gln Thr Val Ile Ala Gln Asn Arg Pro Asp Arg
340 345 350

Pro Gly Phe Leu Arg Glu Leu Met Asp Thr Leu Pro Asp Leu Arg Thr
355 360 365

Leu Ser Thr Leu His Thr Glu Lys Leu Val Val Phe Arg Thr Glu His
370 375 380

Lys Glu Leu Leu Arg Gln Gln Met Trp Ser Glu Glu Glu Ala Val Ser
385 390 395 400

Trp Val Asp Ser Gly Ala Asp Glu Leu Ala Arg Ser Pro Ile Gly Ser
405 410 415

Val Ser Ser Ser Glu Ser Gly Glu Ala Val Gly Asp Cys Gly Thr Pro
420 425 430

Leu Leu Ala Ala Thr Leu Ala Gly Arg Arg Arg Leu Asp Ser Arg Gly
435 440 445

Ser Val Asp Glu Glu Ala Leu Gly Val Ala His Leu Ala His Asn Gly
450 455 460

Leu Thr Val Thr Pro Val Arg Gln Pro Pro Arg Tyr Arg Lys Leu Asp
465 470 475 480

Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys His Glu Arg
485 490 495

Ile Val Gly Thr Gly Ser Gly Cys Ser Ser Pro Arg Ser Ser Leu Glu
500 505 510

Glu His Asn Glu Asp Arg Arg Pro Pro Val Ser Ala Asp Asp Met Pro
515 520 525

Val Leu Lys Arg Val Leu Gln Ala Pro Pro Leu Tyr Gly Gly Thr Pro
530 535 540

Ser Leu Met Asp Glu Ala Tyr Arg Arg His Lys Lys Phe Arg Ala Leu
545 550 555 560

Arg Arg Asp Thr Gly Glu Ala Glu Ala Arg Thr Val Arg Pro Thr Pro
565 570 575

Ser Pro Gln Pro Gln His Pro His Pro Ala Asn Pro Ala His Pro Ala
580 585 590

His Ser Pro Arg Pro Gln Arg Ala Ser Leu Ser Ser Thr His Ser Val
595 600 605

Leu Ala Lys Ser Leu Met Glu Gly Pro Arg Met Thr Pro Glu Gln Leu
610 615 620

Lys Arg Thr Asp Ile Ile Gln Gln Tyr Met Arg Arg Gly Glu Ser Ser
625 630 635 640

Ala Pro Ala Glu Gly Cys Pro Leu Arg Ala Gly Gly Leu Leu Thr Cys
645 650 655

Tyr Arg Gly Ala Ser Pro Ala Pro Gln Pro Val Leu Ala Leu Gln Val
660 665 670

Asp Val Thr Asp Ala Pro Leu Asn Leu Ser Lys Lys Ser Pro Ser Pro
675 680 685

Pro Arg Thr Tyr Met Pro Gln Met Leu Glu Ala
690 695



<210> 51
<211> 1394
<212> PRT
<213> Drosophila melanogaster
<400> 51

Met Val Cys Ala Met Gln Glu Val Ala Ala Val Gln His Gln Gln Gln
1 5 10 15

Gln Gln Gln Leu Gln Leu Pro Gln Gln Gln Gln Gln Gln Gln Gln Thr
20 25 30

Thr Gln Gln Gln His Ala Thr Thr Ile Val Leu Leu Thr Gly Asn Gly
35 40 45

Gly Gly Asn Leu His Ile Val Ala Thr Pro Gln Gln His Gln Pro Met
50 55 60

His Gln Leu His His Gln His Gln His Gln His Gln His Gln Gln Gln
65 70 75 80

Ala Lys Ser Gln Gln Leu Lys Gln Gln His Ser Ala Leu Val Lys Leu
85 90 95

Leu Glu Ser Ala Pro Ile Lys Gln Gln Gln Gln Thr Pro Lys Gln Ile
100 105 110

Val Tyr Leu Gln Gln Gln Gln Gln Gln Pro Gln Arg Lys Arg Leu Lys
115 120 125

Asn Glu Ala Ala Ile Val Gln Gln Gln Gln Gln Thr Pro Ala Thr Leu
130 135 140

Val Lys Thr Thr Thr Thr Ser Asn Ser Asn Ser Asn Asn Thr Gln Thr
145 150 155 160

Thr Asn Ser Ile Ser Gln Gln Gln Gln Gln His Gln Ile Val Leu Gln
165 170 175

His Gln Gln Pro Ala Ala Ala Ala Thr Pro Lys Pro Cys Ala Asp Leu
180 185 190

Ser Ala Lys Asn Asp Ser Glu Ser Gly Ile Asp Glu Asp Cys Pro Asn
195 200 205

Ser Asp Glu Asp Cys Pro Asn Ala Asn Pro Ala Gly Thr Ser Leu Glu
210 215 220

Asp Ser Ser Tyr Glu Gln Tyr Gln Cys Pro Trp Lys Lys Ile Arg Tyr
225 230 235 240

Ala Arg Glu Leu Leu Lys Gln Arg Glu Leu Glu Gln Gln Gln Thr Thr
245 250 255

Gly Gly Ser Asn Ala Gln Gln Gln Val Glu Ala Lys Pro Ala Ala Ile
260 265 270

Pro Thr Ser Asn Ile Lys Gln Leu His Cys Asp Ser Pro Phe Ser Ala
275 280 285

Gln Thr His Lys Glu Ile Ala Asn Leu Leu Arg Gln Gln Ser Gln Gln

290 295 300

Gln Gln Val Val Ala Thr Gln Gln Gln Gln Gln Gln Gln Gln Gln His
305 310 315 320

Gln His Gln Gln Gln Arg Arg Asp Ser Ser Asp Ser Asn Cys Ser Leu
325 330 335

Met Ser Asn Ser Ser Asn Ser Ser Ala Gly Asn Cys Cys Thr Cys Asn
340 345 350

Ala Gly Asp Asp Gln Gln Leu Glu Glu Met Asp Glu Ala His Asp Ser
355 360 365

Gly Cys Asp Asp Glu Leu Cys Glu Gln His His Gln Arg Leu Asp Ser
370 375 380

Ser Gln Leu Asn Tyr Leu Cys Gln Lys Phe Asp Glu Lys Leu Asp Thr
385 390 395 400

Ala Leu Ser Asn Ser Ser Ala Asn Thr Gly Arg Asn Thr Pro Ala Val
405 410 415

Thr Ala Asn Glu Asp Ala Asp Gly Phe Phe Arg Arg Ser Ile Gln Gln
420 425 430

Lys Ile Gln Tyr Arg Pro Cys Thr Lys Asn Gln Gln Cys Ser Ile Leu
435 440 445

Arg Ile Asn Arg Asn Arg Cys Gln Tyr Cys Arg Leu Lys Lys Cys Ile
450 455 460

Ala Val Gly Met Ser Arg Asp Ala Val Arg Phe Gly Arg Val Pro Lys
465 470 475 480

Arg Glu Lys Ala Arg Ile Leu Ala Ala Met Gln Gln Ser Thr Gln Asn
485 490 495

Arg Gly Gln Gln Arg Ala Leu Ala Thr Glu Leu Asp Asp Gln Pro Arg
500 505 510

Leu Leu Ala Ala Val Leu Arg Ala His Leu Glu Thr Cys Glu Phe Thr
515 520 525

Lys Glu Lys Val Ser Ala Met Arg Gln Arg Ala Arg Asp Cys Pro Ser
530 535 540

Tyr Ser Met Pro Thr Leu Leu Ala Cys Pro Leu Asn Pro Ala Pro Glu
545 550 555 560

Leu Gln Ser Glu Gln Glu Phe Ser Gln Arg Phe Ala His Val Ile Arg
565 570 575

Gly Val Ile Asp Phe Ala Gly Met Ile Pro Gly Phe Gln Leu Leu Thr
580 585 590

Gln Asp Asp Lys Phe Thr Leu Leu Lys Ala Gly Leu Phe Asp Ala Leu
595 600 605

Phe Val Arg Leu Ile Cys Met Phe Asp Ser Ser Ile Asn Ser Ile Ile
610 615 620

Cys Leu Asn Gly Gln Val Met Arg Arg Asp Ala Ile Gln Asn Gly Ala
625 630 635 640

Asn Ala Arg Phe Leu Val Asp Ser Thr Phe Asn Phe Ala Glu Arg Met
645 650 655

Asn Ser Met Asn Leu Thr Asp Ala Glu Ile Gly Leu Phe Cys Ala Ile
660 665 670

Val Leu Ile Thr Pro Asp Arg Pro Gly Leu Arg Asn Leu Glu Leu Ile
675 680 685

Glu Lys Met Tyr Ser Arg Leu Lys Gly Cys Leu Gln Tyr Ile Val Ala
690 695 700

Gln Asn Arg Pro Asp Gln Pro Glu Phe Leu Ala Lys Leu Leu Glu Thr
705 710 715 720

Met Pro Asp Leu Arg Thr Leu Ser Thr Leu His Thr Glu Lys Leu Val
725 730 735

Val Phe Arg Thr Glu His Lys Glu Leu Leu Arg Gln Gln Met Trp Ser
740 745 750

Met Glu Asp Gly Asn Asn Ser Asp Gly Gln Gln Asn Lys Ser Pro Ser
755 760 765

Gly Ser Trp Ala Asp Ala Met Asp Val Glu Ala Ala Lys Ser Pro Leu
770 775 780

Gly Ser Val Ser Ser Thr Glu Ser Ala Asp Leu Asp Tyr Gly Ser Pro
785 790 795 800

Ser Ser Ser Gln Pro Gln Gly Val Ser Leu Pro Ser Pro Pro Gln Gln
805 810 815

Gln Pro Ser Ala Leu Ala Ser Ser Ala Pro Leu Leu Ala Ala Thr Leu
820 825 830

Ser Gly Gly Cys Pro Leu Arg Asn Arg Ala Asn Ser Gly Ser Ser Gly
835 840 845

Asp Ser Gly Ala Ala Glu Met Asp Ile Val Gly Ser His Ala His Leu
850 855 860

Thr Gln Asn Gly Leu Thr Ile Thr Pro Ile Val Arg His Gln Gln Gln
865 870 875 880

Gln Gln Gln Gln Gln Gln Ile Gly Ile Leu Asn Asn Ala His Ser Arg
885 890 895

Asn Leu Asn Gly Gly His Ala Met Cys Gln Gln Gln Gln Gln His Pro
900 905 910

Gln Leu His His His Leu Thr Ala Gly Ala Ala Arg Tyr Arg Lys Leu
915 920 925

Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys Asn Glu
930 935 940

Cys Lys Ala Val Ser Ser Gly Gly Ser Ser Ser Cys Ser Ser Pro Arg
945 950 955 960

Ser Ser Val Asp Asp Ala Leu Asp Cys Ser Asp Ala Ala Ala Asn His
965 970 975

Asn Gln Val Val Gln His Pro Gln Leu Ser Val Val Ser Val Ser Pro
980 985 990

Val Arg Ser Pro Gln Pro Ser Thr Ser Ser His Leu Lys Arg Gln Ile
995 1000 1005

Val Glu Asp Met Pro Val Leu Lys Arg Val Leu Gln Ala Pro Pro
1010 1015 1020

Leu Tyr Asp Thr Asn Ser Leu Met Asp Glu Ala Tyr Lys Pro His
1025 1030 1035

Lys Lys Phe Arg Ala Leu Arg His Arg Glu Phe Glu Thr Ala Glu
1040 1045 1050

Ala Asp Ala Ser Ser Ser Thr Ser Gly Ser Asn Ser Leu Ser Ala
1055 1060 1065

Gly Ser Pro Arg Gln Ser Pro Val Pro Asn Ser Val Ala Thr Pro
1070 1075 1080

Pro Pro Ser Ala Ala Ser Ala Ala Ala Gly Asn Pro Ala Gln Ser
1085 1090 1095

Gln Leu His Met His Leu Thr Arg Ser Ser Pro Lys Ala Ser Met
1100 1105 1110

Ala Ser Ser His Ser Val Leu Ala Lys Ser Leu Met Ala Glu Pro
1115 1120 1125

Arg Met Thr Pro Glu Gln Met Lys Arg Ser Asp Ile Ile Gln Asn
1130 1135 1140

Tyr Leu Lys Arg Glu Asn Ser Thr Ala Ala Ser Ser Thr Thr Asn
1145 1150 1155

Gly Val Gly Asn Arg Ser Pro Ser Ser Ser Ser Thr Pro Pro Pro
1160 1165 1170

Ser Ala Val Gln Asn Gln Gln Arg Trp Gly Ser Ser Ser Val Ile
1175 1180 1185

Thr Thr Thr Cys Gln Gln Arg Gln Gln Ser Val Ser Pro His Ser
1190 1195 1200

Asn Gly Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser Ser
1205 1210 1215

Ser Ser Ser Ser Thr Ser Ser Asn Cys Ser Ser Ser Ser Ala Ser
1220 1225 1230

Ser Cys Gln Tyr Phe Gln Ser Pro His Ser Thr Ser Asn Gly Thr
1235 1240 1245

Ser Ala Pro Ala Ser Ser Ser Ser Gly Ser Asn Ser Ala Thr Pro
1250 1255 1260

Leu Leu Glu Leu Gln Val Asp Ile Ala Asp Ser Ala Gln Pro Leu
1265 1270 1275

Asn Leu Ser Lys Lys Ser Pro Thr Pro Pro Pro Ser Lys Leu His
1280 1285 1290

Ala Leu Val Ala Ala Ala Asn Ala Val Gln Arg Tyr Pro Thr Leu
1295 1300 1305

Ser Ala Asp Val Thr Val Thr Ala Ser Asn Gly Gly Ser Ser Val
1310 1315 1320

Gly Gly Gly Glu Ser Gly Arg Gln Gln Gln Ser Ala Gly Glu Cys
1325 1330 1335

Gly Leu Pro Gln Ser Gly Pro Glu Arg Arg Arg Ala Gln Gly Asn
1340 1345 1350

Ala Gly Gly Val Arg Ala Gly Gly Gly Arg Trp Phe Tyr Ala Glu
1355 1360 1365

Lys Trp Glu Arg Gln Arg Leu Gly Val Ala Val Gln Arg Ser Arg
1370 1375 1380

Lys Gln Asp His Leu Glu Arg Arg Glu Leu Asn
1385 1390



<210> 52
<211> 685
<212> PRT
<213> Manduca sexta
<400> 52

Met Val Arg Ala Met Ser Cys Gly Ala Glu Leu Arg Glu Arg His Ser
1 5 10 15

Val Leu Val Ser Met Leu Glu Ala Arg Arg Glu Ser Ser Asp Ser Gly
20 25 30

Cys Ser Ser Asp Asp Gly Ser Asp Val Glu Arg Asp Cys Lys Cys Arg
35 40 45

Cys Asp Pro Gln Gly Phe Phe Arg Arg Ser Ile Gln Gln Lys Ile Gln
50 55 60

Tyr Arg Pro Cys Thr Lys Asn Gln Gln Cys Ser Ile Leu Arg Ile Asn
65 70 75 80

Arg Asn Arg Cys Gln Tyr Cys Arg Leu Lys Lys Cys Ile Ala Val Gly
85 90 95

Met Ser Arg Asp Ala Val Arg Phe Gly Arg Val Pro Lys Arg Glu Lys
100 105 110

Ala Arg Ile Leu Ala Ala Met Gln Gln Ser Ser Thr Ser Arg Ala His
115 120 125

Glu Gln Ala Ala Ala Ala Glu Leu Asp Asp Ala Pro Arg Leu Leu Ala
130 135 140

Arg Val Val Arg Ala His Leu Asp Thr Cys Glu Phe Thr Arg Asp Arg
145 150 155 160

Val Ala Ala Met Arg Ala Arg Ala Arg Asp Cys Pro Thr Tyr Ser Gln
165 170 175

Pro Thr Leu Ala Cys Pro Leu Asn Pro Ala Pro Glu Leu Gln Ser Glu
180 185 190

Lys Glu Phe Ser Gln Arg Phe Ala His Val Ile Arg Gly Val Ile Asp
195 200 205

Phe Ala Gly Leu Ile Pro Gly Phe Gln Leu Leu Thr Gln Asp Asp Lys
210 215 220

Phe Thr Leu Leu Lys Ser Gly Leu Phe Asp Ala Leu Phe Val Arg Leu
225 230 235 240

Ile Cys Met Phe Asp Ala Pro Leu Asn Ser Ile Ile Cys Leu Asn Gly
245 250 255

Gln Leu Met Lys Arg Asp Ser Ile Gln Ser Gly Ala Asn Ala Arg Phe
260 265 270

Leu Val Asp Ser Thr Phe Lys Phe Ala Glu Arg Met Asn Ser Met Asn
275 280 285

Leu Thr Asp Ala Glu Ile Gly Leu Phe Cys Ala Ile Val Leu Ile Thr
290 295 300

Pro Asp Arg Pro Gly Leu Arg Asn Val Glu Leu Val Glu Arg Met His
305 310 315 320

Thr Arg Leu Lys Ala Cys Leu Gln Thr Val Ile Ala Gln Asn Arg Pro
325 330 335

Asp Arg Pro Gly Phe Leu Arg Glu Leu Met Asp Thr Leu Pro Asp Leu
340 345 350

Arg Thr Leu Ser Thr Leu His Thr Glu Lys Leu Val Val Phe Arg Thr
355 360 365

Glu His Lys Glu Leu Leu Arg Gln Gln Met Trp Ser Glu Glu Glu Ala
370 375 380

Val Ser Trp Val Asp Ser Gly Ala Asp Glu Leu Ala Arg Ser Pro Ile
385 390 395 400

Gly Ser Val Ser Ser Ser Glu Ser Gly Glu Ala Val Gly Asp Cys Gly
405 410 415

Thr Pro Leu Leu Ala Ala Thr Leu Ala Gly Arg Arg Arg Leu Asp Ser
420 425 430

Arg Gly Ser Val Asp Glu Glu Ala Leu Gly Val Ala His Leu Ala His
435 440 445

Asn Gly Leu Thr Val Thr Pro Val Arg Gln Pro Pro Arg Tyr Arg Lys
450 455 460

Leu Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys His
465 470 475 480

Glu Arg Ile Val Gly Thr Gly Ser Gly Cys Ser Ser Pro Arg Ser Ser
485 490 495

Leu Glu Glu His Asn Glu Asp Arg Arg Pro Pro Val Ser Ala Asp Asp
500 505 510

Met Pro Val Leu Lys Arg Val Leu Gln Ala Pro Pro Leu Tyr Gly Gly
515 520 525

Thr Pro Ser Leu Met Asp Glu Ala Tyr Arg Arg His Lys Lys Phe Arg
530 535 540

Ala Leu Arg Arg Asp Thr Gly Glu Ala Glu Ala Arg Thr Val Arg Pro
545 550 555 560

Thr Pro Ser Pro Gln Pro Gln His Pro His Pro Ala Asn Pro Ala His
565 570 575

Pro Ala His Ser Pro Arg Pro Gln Arg Ala Ser Leu Ser Ser Thr His
580 585 590

Ser Val Leu Ala Lys Ser Leu Met Glu Gly Pro Arg Met Thr Pro Glu
595 600 605

Gln Leu Lys Arg Thr Asp Ile Ile Gln Gln Tyr Met Arg Arg Gly Glu
610 615 620

Ser Ser Ala Pro Ala Glu Gly Cys Pro Leu Arg Ala Gly Gly Leu Leu
625 630 635 640

Thr Cys Tyr Arg Gly Ala Ser Pro Ala Pro Gln Pro Val Leu Ala Leu
645 650 655

Gln Val Asp Val Thr Asp Ala Pro Leu Asn Leu Ser Lys Lys Ser Pro
660 665 670

Ser Pro Pro Arg Thr Tyr Met Pro Gln Met Leu Glu Ala
675 680 685



<210> 53 
<211> 1443 
<212> PRT 

<213> Drosophila melanogaster 
<400> 53 

Met His Gly Gly Gly Pro Gly Ser Ser Gly Ser Asn lie lie Arg Arg 
15 10 15 

Ser Ser Gly Ser Phe Pro Gly Ser Gly Ser Gly Ser Ala Ser Lys Leu 

20 25 30 

lie Lys Thr Glu Pro lie Asp Phe Glu Met Leu His Leu Glu Glu Asn 
35 40 45 
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Glu Arg Gin Gin Asp lie Glu Arg Glu Pro Ser Ser Ser Asn Ser Asn 
50 55 60 

Ser Asn Ser Asn Ser Leu Thr Pro Gin Arg Tyr Thr His Val Gin Val 
65 70 75 80 

Gin Thr Val Pro Pro Arg Gin Pro Thr Gly Leu Thr Thr Pro Gly Gly 

85 90 95 

Thr Gin Lys Val lie Leu Thr Pro Arg Val Glu Tyr Val Gin Gin Arg 

100 105 110 

Ala Thr Ser Ser Thr Gly Gly Gly Met Lys His Val Tyr Ser Gin Gin 
115 120 125 

Gin Gly Thr Ala Ala Ser Arg Ser Ala Pro Pro Glu Thr Thr Ala Leu 
130 135 140 

Leu Thr Thr Thr Ser Gly Thr Pro Gin lie lie lie Thr Arg Thr Leu 
145 150 155 160 

Pro Ser Asn Gin His Leu Ser Arg Arg His Ser Ala Ser Pro Ser Ala 

165 170 175 

Leu His His Tyr Gin Gin Gin Gin Gin Pro Gin Arg Gin Gin Ser Pro 

180 185 190 

Pro Pro Leu His His Gin Gin Gin Gin Gin Gin Gin His Val Arg Val 
195 200 205 

lie Arg Asp Gly Arg Leu Tyr Asp Glu Ala Thr Val Val Val Ala Ala 

210 215 220 

Arg Arg His Ser Val Ser Pro Pro Pro Leu His His His Ser Arg Ser 
225 230 235 240 

Ala Pro Val Ser Pro Val He Ala Arg Arg Gly Gly Ala Ala Ala Tyr 

245 250 255 

Met Asp Gin Gin Tyr Gin Gin Arg Gin Thr Pro Pro Leu Ala Pro Pro 

260 265 270 

Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Gin Gin Gin 
275 280 285 

Gin Gin Gin Tyr He Ser Thr Gly Val Pro Pro Pro Thr Ala Ala Ala 
290 295 300 

Arg Lys Phe Val Val Ser Thr Ser Thr Arg His Val Asn Val lie Ala 
305 310 315 320 

Ser Asn His Phe Gin Gin Gin Gin Gin Gin His Gin Ala Gin Gin His 

325 330 335 

Gin Gin Gin His Gin Gin His Val He Ala Ser Val Ser Ser Ser Ser 

340 345 350 

Ser Ser Ser Ala He Gly Ser Gly Gly Ser Ser Ser Ser His He Phe 
355 360 365 
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Arg Thr Pro Val Val Ser Ser Ser Ser Ser Ser Asn Met His His Gin 
370 375 380 

Gin Gin Gin Gin Gin Gin Gin Ser Ser Leu Gly Asn Ser Val Met Arg 
385 390 395 400 

Pro Pro Pro Pro Pro Pro Pro Pro Lys Val Lys His Ala Ser Ser Ser 

405 410 415 

Ser Ser Gly Asn Ser Ser Ser Ser Asn Thr Asn Asn Ser Ser Ser Ser 

420 425 430 

Ser Asn Gly Glu Glu Pro Ser Ser Ser lie Pro Asp Leu Glu Phe Asp 
435 440 445 

Gly Thr Thr Val Leu Cys Arg Val Cys Gly Asp Lys Ala Ser Gly Phe 
450 455 460 

His Tyr Gly Val His Ser Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg 
465 470 475 480 

Ser lie Gin Gin Lys lie Gin Tyr Arg Pro Cys Thr Lys Asn Gin Gin 

485 490 495 

Cys Ser lie Leu Arg lie Asn Arg Asn Arg Cys Gin Tyr Cys Arg Leu 

500 505 510 

Lys Lys Cys lie Ala Val Gly Met Ser Arg Asp Ala Val Arg Phe Gly 
515 520 525 

Arg Val Pro Lys Arg Glu Lys Ala Arg lie Trp Arg Pro Cys Asn Arg 
530 535 540 

Ala Pro Arg lie Ala Ala Ser Ser Asp Pro Ser Pro Pro Ser Trp Met 
545 550 555 560 

Thr Ser His Ala Ser Ser Pro Pro Cys Cys Cys Ala His Leu Glu Thr 

565 570 575 

Cys Glu Phe Thr Lys Glu Lys Val Ser Ala Met Arg His Gly Arg Gly 

580 585 590 

Leu Pro Ser Thr Pro Cys His Thr Ser Gly Leu Ser Ala Glu Pro Ala 
595 600 605 

Pro Glu Leu Gin Ser Glu Gin Glu Phe Ser Gin Arg Phe Ala His Val 
610 615 620 

lie Arg Gly Val lie Asp Phe Ala Gly Met lie Pro Gly Phe Gin Leu 
625 630 635 640 

Leu Thr Gin Asp Asp Lys Phe Thr Leu Leu Lys Ala Gly Leu Phe Asp 

645 650 655 

Ala Leu Phe Val Arg Leu lie Cys Met Phe Asp Ser Ser lie Asn Ser 

660 665 670 

lie lie Cys Leu Asn Gly Gin Val Met Arg Arg Asp Ala lie Gin Asn 
675 680 685 
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Gly Ala Asn Ala Arg Phe Leu Val Asp Ser Thr Phe Asn Phe Ala Glu 
690 695 700 

Arg Met Asn Ser Met Asn Leu Thr Asp Ala Glu lie Gly Leu Phe Cys 
705 710 715 720 

Ala lie Val Leu lie Thr Pro Asp Arg Pro Gly Leu Arg Asn Leu Glu 

725 730 735 

Leu lie Glu Lys Met Tyr Ser Arg Leu Lys Gly Cys Leu Gin Tyr lie 

740 745 750 

Val Ala Gin Asn Arg Pro Asp Gin Pro Glu Phe Leu Ala Lys Leu Leu 
755 760 765 

Glu Thr Met Pro Asp Leu Arg Thr Leu Ser Thr Leu His Thr Glu Lys 
770 775 780 

Leu Val Val Phe Arg Thr Glu His Lys Glu Leu Leu Arg Gin Gin Met 
785 790 795 800 

Trp Ser Met Glu Asp Gly Asn Asn Ser Asp Gly Gin Gin Asn Lys Ser 

805 810 815 

Pro Ser Gly Ser Trp Ala Asp Ala Met Asp Val Glu Ala Ala Lys Ser 

820 825 830 

Pro Leu Gly Ser Val Ser Ser Thr Glu Ser Ala Asp Leu Asp Tyr Gly 
835 840 845 

Ser Pro Ser Ser Ser Gln Pro Gln Gly Val Ser Leu Pro Ser Pro Pro
850 855 860

Gln Gln Gln Pro Ser Ala Leu Ala Ser Ser Ala Pro Leu Leu Ala Ala
865 870 875 880

Thr Leu Ser Gly Gly Cys Pro Leu Arg Asn Arg Ala Asn Ser Gly Ser
885 890 895

Ser Gly Asp Ser Gly Ala Ala Glu Met Asp Ile Val Gly Ser His Ala
900 905 910

His Leu Thr Gln Asn Gly Leu Thr Ile Thr Pro Ile Val Arg His Gln
915 920 925

Gln Gln Gln Gln Gln Gln Gln Gln Ile Gly Ile Leu Asn Asn Ala His
930 935 940

Ser Arg Asn Leu Asn Gly Gly His Ala Met Cys Gln Gln Gln Gln Gln
945 950 955 960

His Pro Gln Leu His His His Leu Thr Ala Gly Ala Ala Arg Tyr Arg
965 970 975

Lys Leu Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys
980 985 990

Asn Glu Cys Lys Ala Val Ser Ser Gly Gly Ser Ser Ser Cys Ser Ser
995 1000 1005

Pro Arg Ser Ser Val Asp Asp Ala Leu Asp Cys Ser Asp Ala Ala
1010 1015 1020

Ala Asn His Asn Gln Val Val Gln His Pro Gln Leu Ser Val Val
1025 1030 1035

Ser Val Ser Pro Val Arg Ser Pro Gln Pro Ser Thr Ser Ser His
1040 1045 1050

Leu Lys Arg Gln Ile Val Glu Asp Met Pro Val Leu Lys Arg Val
1055 1060 1065

Leu Gln Ala Pro Pro Leu Tyr Asp Thr Asn Ser Leu Met Asp Glu
1070 1075 1080

Ala Tyr Lys Pro His Lys Lys Phe Arg Ala Leu Arg His Arg Glu
1085 1090 1095



Phe Glu Thr Ala Glu Ala Asp Ala Ser Ser Ser Thr Ser Gly Ser
1100 1105 1110

Asn Ser Leu Ser Ala Gly Ser Pro Arg Gln Ser Pro Val Pro Asn
1115 1120 1125

Ser Val Ala Thr Pro Pro Pro Val Ala Ala Ser Ala Ala Ala Gly
1130 1135 1140

Asn Pro Ala Gln Ser Gln Leu His Met His Leu Thr Arg Ser Ser
1145 1150 1155

Pro Lys Ala Ser Met Ala Ser Ser His Ser Val Leu Ala Lys Ser
1160 1165 1170

Leu Met Ala Glu Pro Arg Met Thr Pro Glu Gln Met Lys Arg Ser
1175 1180 1185

Asp Ile Ile Gln Asn Tyr Leu Lys Arg Glu Asn Ser Thr Ala Ala
1190 1195 1200

Ser Ser Thr Thr Asn Gly Leu Gly Asn Arg Ser Pro Ser Ser Ser
1205 1210 1215

Ser Thr Pro Pro Pro Ser Val Gln Asn Gln Gln Arg Trp Gly Ser
1220 1225 1230

Ser Ser Val Ile Thr Thr Thr Cys Gln Gln Arg Gln Gln Ser Val
1235 1240 1245

Ser Pro His Ser Asn Gly Ser Ser Ser Ser Ser Ser Ser Ser Ser
1250 1255 1260

Ser Ser Ser Ser Ser Ser Ser Ser Thr Ser Ser Asn Cys Ser Ser
1265 1270 1275

Ser Ser Ala Ser Ser Cys Gln Tyr Phe Gln Ser Pro His Ser Thr
1280 1285 1290

Ser Ile Gly Thr Gly Glu Pro Asp Gly Ala Pro Val Arg Asp Arg
1295 1300 1305



Thr Ala Pro Arg Pro Cys Trp Asn Cys Arg Trp Thr Leu Leu Thr
1310 1315 1320

Arg Arg Thr Ser Gln Phe Val Gln Glu Ile Ala His Ala Ala Ala
1325 1330 1335

Gln Gln Ala Ala Arg Ser Gly Gly Arg Arg Gln Cys Arg Ser Lys
1340 1345 1350

Val Ser His Ile Val Arg Arg Arg His Ser Asp Ser Leu Gln Trp
1355 1360 1365

Arg Ser Ser Val Gly Gly Gly Glu Ser Gly Ala Gln Gln Gln Ser
1370 1375 1380

Ala Gly Glu Cys Gly Leu Pro Gln Ser Gly Pro Glu Arg Arg Arg
1385 1390 1395

Ala Gln Gly Asn Ala Gly Gly Val Arg Ala Gly Gly Gly Arg Trp
1400 1405 1410

Phe Tyr Ala Glu Lys Trp Glu Arg Gln Arg Leu Gly Val Ala Val
1415 1420 1425

Gln Arg Ser Arg Lys Gln Asp His Leu Glu Arg Arg Glu Leu Asn
1430 1435 1440



<210> 54 
<211> 690
<212> PRT
<213> Choristoneura fumiferana
<400> 54

Met Thr Leu Val Met Ser Pro Asp Ser Ser Tyr Gly Arg Tyr Asp Ala
1 5 10 15

Gln Pro Pro Val Asp Gly Gly Met Val Asn Pro Val His Arg Glu Arg
20 25 30

Glu Pro Glu Leu His Ile Glu Phe Asp Gly Thr Thr Val Leu Cys Arg
35 40 45

Val Cys Gly Asp Lys Ala Ser Gly Phe His Tyr Gly Val His Ser Cys
50 55 60

Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser Ile Gln Gln Lys Ile Gln
65 70 75 80

Tyr Arg Pro Cys Thr Lys Asn Gln Gln Cys Ser Ile Leu Arg Ile Asn
85 90 95

Arg Asn Arg Cys Gln Tyr Cys Arg Leu Lys Lys Cys Ile Ala Val Gly
100 105 110

Met Ser Arg Asp Ala Val Arg Phe Gly Arg Val Pro Lys Arg Glu Lys
115 120 125

Ala Arg Ile Leu Ala Ala Met Gln Gln Ser Ser Ser Ser Arg Ala His
130 135 140

Glu Gln Ala Ala Ala Ala Glu Leu Asp Asp Ala Pro Arg Leu Leu Ala
145 150 155 160

Arg Val Val Arg Ala His Leu Asp Thr Cys Glu Phe Thr Arg Asp Arg
165 170 175

Val Ala Ala Met Arg Ala Arg Ala Arg Asp Cys Pro Thr Tyr Ser Gln
180 185 190

Pro Thr Leu Ala Cys Pro Leu Asn Pro Ala Pro Glu Leu Gln Ser Glu
195 200 205

Lys Glu Phe Ser Gln Arg Phe Ala His Val Ile Arg Gly Val Ile Asp
210 215 220

Phe Ala Gly Leu Ile Pro Gly Phe Gln Leu Leu Thr Gln Asp Asp Lys
225 230 235 240

Phe Thr Leu Leu Lys Ser Gly Leu Phe Asp Ala Leu Phe Val Arg Leu
245 250 255

Ile Cys Met Phe Asp Ala Pro Leu Asn Ser Ile Ile Cys Leu Asn Gly
260 265 270

Gln Leu Met Lys Arg Asp Ser Ile Gln Ser Gly Ala Asn Ala Arg Phe
275 280 285

Leu Val Asp Ser Thr Phe Lys Phe Ala Glu Arg Met Asn Ser Met Asn
290 295 300

Leu Thr Asp Ala Glu Ile Gly Leu Phe Cys Ala Ile Val Leu Ile Thr
305 310 315 320

Pro Asp Arg Pro Gly Leu Arg Asn Ile Glu Leu Val Glu Arg Met His
325 330 335

Ala Arg Leu Lys Ser Cys Leu Gln Thr Val Ile Ala Gln Asn Arg Ala
340 345 350

Asp Arg Pro Gly Phe Leu Arg Glu Leu Met Asp Thr Leu Pro Asp Leu
355 360 365

Arg Thr Leu Ser Thr Leu His Thr Glu Lys Leu Val Val Phe Arg Thr
370 375 380

Glu His Lys Glu Leu Leu Arg Gln Gln Met Trp Gly Asp Glu Glu Val
385 390 395 400

Cys Pro Trp Ala Asp Ser Gly Val Asp Asp Ser Ala Arg Ser Pro Leu
405 410 415

Gly Ser Val Ser Ser Ser Glu Ser Gly Glu Ala Pro Ser Asp Cys Gly
420 425 430

Thr Pro Leu Leu Ala Ala Thr Leu Ala Gly Arg Arg Arg Leu Asp Ser
435 440 445

Arg Gly Ser Val Asp Glu Glu Ala Leu Gly Val Ala His Leu Ala His
450 455 460

Asn Gly Leu Thr Val Thr Pro Val Arg Pro Pro Pro Arg Tyr Arg Lys
465 470 475 480

Leu Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys His
485 490 495

Glu Arg Ile Val Gly Pro Gly Ser Gly Cys Ser Ser Pro Arg Ser Ser
500 505 510

Leu Glu Glu His Met Glu Asp Arg Arg Pro Leu Ala Ala Asp Asp Met
515 520 525

Pro Val Leu Lys Arg Val Leu Gln Ala Pro Pro Leu Tyr Asp Ala Ser
530 535 540

Ser Leu Met Asp Glu Ala Tyr Lys Pro His Lys Lys Phe Arg Ala Met
545 550 555 560

Arg Arg Asp Thr Gly Glu Ala Glu Ala Arg Pro Met Arg Pro Thr Pro
565 570 575

Ser Pro Gln Pro Met His Pro His Pro Gly Ser Pro Ala His Pro Ala
580 585 590

His Pro Ala His Ser Pro Arg Pro Leu Arg Ala Pro Leu Ser Ser Thr
595 600 605

His Ser Val Leu Ala Lys Ser Leu Met Glu Gly Pro Arg Met Thr Pro
610 615 620

Glu Gln Leu Lys Arg Thr Asp Ile Ile Gln Gln Tyr Met Arg Arg Gly
625 630 635 640

Glu Ala Gly Glu Glu Cys Arg Ala Gly Leu Leu Leu Tyr Arg Gly Ala
645 650 655

Ser Pro Leu Gln Val Asp Val Ala Asp Ala Pro Gln Pro Leu Asn Leu
660 665 670

Ser Lys Lys Ser Pro Ser Pro Pro Arg Ser Phe Met Pro Pro Met Leu
675 680 685

Glu Ala
690

<210> 55
<211> 711
<212> PRT
<213> Galleria mellonella
<400> 55

Met Thr Leu Val Met Ser Pro Asp Ser Ser Tyr Gly Arg Tyr Asp Ala
1 5 10 15

Pro Ala Pro Ala Asp Asn Arg Ile Met Ser Pro Val His Lys Glu Arg
20 25 30

Glu Pro Glu Leu His Ile Glu Phe Asp Gly Thr Thr Val Leu Cys Arg
35 40 45

Val Cys Gly Asp Lys Ala Ser Gly Phe His Tyr Gly Val His Ser Cys
50 55 60

Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser Ile Gln Gln Lys Ile Gln
65 70 75 80

Tyr Arg Pro Cys Thr Lys Asn Gln Gln Cys Ser Ile Leu Arg Ile Asn
85 90 95

Arg Asn Arg Cys Gln Tyr Cys Arg Leu Lys Lys Cys Ile Ala Val Gly
100 105 110

Met Ser Arg Asp Ala Val Arg Phe Gly Arg Val Pro Lys Arg Glu Lys
115 120 125

Ala Arg Ile Leu Ala Ala Met Gln Ser Ser Thr Thr Arg Ala His Glu
130 135 140

Gln Ala Ala Ala Ala Glu Leu Asp Asp Gly Pro Arg Leu Leu Ala Arg
145 150 155 160

Val Val Arg Ala His Leu Asp Thr Cys Glu Phe Thr Arg Asp Arg Val
165 170 175

Ala Ala Met Arg Asn Gly Ala Arg Asp Cys Pro Thr Tyr Ser Gln Pro
180 185 190

Thr Leu Ala Cys Pro Leu Asn Pro Ala Pro Glu Leu Gln Ser Glu Lys
195 200 205

Glu Phe Ser Gln Arg Phe Ala His Val Ile Arg Gly Val Ile Asp Phe
210 215 220

Ala Gly Leu Ile Pro Gly Phe Gln Leu Leu Thr Gln Asp Asp Lys Phe
225 230 235 240

Thr Leu Leu Lys Ser Gly Leu Phe Asp Ala Leu Phe Val Arg Leu Ile
245 250 255

Cys Met Phe Asp Ala Pro Leu Asn Ser Ile Ile Cys Leu Asn Gly Gln
260 265 270

Leu Met Lys Arg Asp Ser Ile Gln Ser Gly Ala Asn Ala Arg Phe Leu
275 280 285

Val Asp Ser Thr Phe Lys Phe Ala Glu Arg Met Asn Ser Met Asn Leu
290 295 300

Thr Asp Ala Glu Ile Gly Leu Phe Cys Ala Ile Val Leu Ile Thr Pro
305 310 315 320

Asp Arg Pro Gly Leu Arg Asn Val Glu Leu Val Glu Arg Met His Ser
325 330 335

Arg Leu Lys Ser Cys Leu Gln Thr Val Ile Ala Gln Asn Arg Ser Asp
340 345 350

Gly Pro Gly Phe Leu Arg Glu Leu Met Asp Thr Leu Pro Asp Leu Arg
355 360 365






Thr Leu Ser Thr Leu His Thr Glu Lys Leu Val Val Phe Arg Thr Glu
370 375 380

His Lys Glu Leu Leu Arg Gln Gln Met Trp Val Glu Asp Glu Gly Ala
385 390 395 400

Leu Trp Ala Asp Ser Gly Ala Asp Asp Ser Ala Arg Ser Pro Ile Gly
405 410 415

Ser Val Ser Ser Ser Glu Ser Ser Glu Thr Thr Gly Asp Cys Gly Thr
420 425 430

Pro Leu Leu Ala Ala Thr Leu Ala Gly Arg Arg Arg Leu Asp Ser Arg
435 440 445

Gly Ser Val Asp Glu Glu Ala Leu Gly Val Ala His Leu Ala His Asn
450 455 460

Gly Leu Thr Val Thr Pro Val Arg Pro Pro Pro Arg Tyr Arg Lys Leu
465 470 475 480

Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys His Glu
485 490 495

Arg Ile Val Gly Pro Glu Ser Gly Cys Ser Ser Pro Arg Ser Ser Leu
500 505 510

Glu Glu His Ser Asp Asp Arg Arg Pro Ile Ala Pro Ala Asp Asp Met
515 520 525

Pro Val Leu Lys Arg Val Leu Gln Ala Pro Pro Leu Tyr Asp Ala Ser
530 535 540

Ser Leu Met Asp Glu Ala Tyr Lys Pro His Lys Lys Phe Arg Ala Met
545 550 555 560

Arg Arg Asp Thr Trp Ser Glu Ala Glu Ala Arg Pro Gly Arg Pro Thr
565 570 575

Pro Ser Pro Gln Pro Pro His His Pro His Pro Ala Ser Pro Ala His
580 585 590

Pro Ala His Ser Pro Arg Pro Ile Arg Ala Pro Leu Ser Ser Thr His
595 600 605

Ser Val Leu Ala Lys Ser Leu Met Glu Gly Pro Arg Met Thr Pro Glu
610 615 620

Gln Leu Lys Arg Thr Asp Ile Ile Gln Gln Tyr Met Arg Arg Gly Glu
625 630 635 640

Thr Gly Ala Pro Thr Glu Gly Cys Pro Leu Arg Ala Gly Gly Leu Leu
645 650 655

Thr Cys Phe Arg Gly Ala Ser Pro Ala Pro Gln Pro Val Ile Ala Leu
660 665 670

Gln Val Asp Val Ala Glu Thr Asp Ala Pro Gln Pro Leu Asn Leu Ser
675 680 685

Lys Lys Ser Pro Ser Pro Ser Pro Pro Pro Pro Pro Pro Arg Ser Tyr
690 695 700

Met Pro Pro Met Leu Pro Ala 
705 710 



<210> 56 
<211> 711 
<212> PRT 

<213> Metapenaeus ensis 
<400> 56 

Met Thr Leu Val Met Ser Pro Asp Ser Ser Tyr Gly Arg Tyr Asp Ala
1 5 10 15

Pro Ala Pro Ala Asp Asn Arg Ile Met Ser Pro Val His Lys Glu Arg
20 25 30

Glu Pro Glu Leu His Ile Glu Phe Asp Gly Thr Thr Val Leu Cys Arg
35 40 45

Val Cys Gly Asp Lys Ala Ser Gly Phe His Tyr Gly Val His Ser Cys
50 55 60

Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser Ile Gln Gln Lys Ile Gln
65 70 75 80

Tyr Arg Pro Cys Thr Lys Asn Gln Gln Cys Ser Ile Leu Arg Ile Asn
85 90 95

Arg Asn Arg Cys Gln Tyr Cys Arg Leu Lys Lys Cys Ile Ala Val Gly
100 105 110

Met Ser Arg Asp Ala Val Arg Phe Gly Arg Val Pro Lys Arg Glu Lys
115 120 125

Ala Arg Ile Leu Ala Ala Met Gln Ser Ser Thr Thr Arg Ala His Glu
130 135 140

Gln Ala Ala Ala Ala Glu Leu Asp Asp Gly Pro Arg Leu Leu Ala Arg
145 150 155 160

Val Val Arg Ala His Leu Asp Thr Cys Glu Phe Thr Arg Asp Arg Val
165 170 175

Ala Ala Met Arg Asn Gly Ala Arg Asp Cys Pro Thr Tyr Ser Gln Pro
180 185 190

Thr Leu Ala Cys Pro Leu Asn Pro Ala Pro Glu Leu Gln Ser Glu Lys
195 200 205

Glu Phe Ser Gln Arg Phe Ala His Val Ile Arg Gly Val Ile Asp Phe
210 215 220

Ala Gly Leu Ile Pro Gly Phe Gln Leu Leu Thr Gln Asp Asp Lys Phe
225 230 235 240

Thr Leu Leu Lys Ser Gly Leu Phe Asp Ala Leu Phe Val Arg Leu Ile
245 250 255

Cys Met Phe Asp Ala Pro Leu Asn Ser Ile Ile Cys Leu Asn Gly Gln
260 265 270

Leu Met Lys Arg Asp Ser Ile Gln Ser Gly Ala Asn Ala Arg Phe Leu
275 280 285

Val Asp Ser Thr Phe Lys Phe Ala Glu Arg Met Asn Ser Met Asn Leu
290 295 300

Thr Asp Ala Glu Ile Gly Leu Phe Cys Ala Ile Val Leu Ile Thr Pro
305 310 315 320

Asp Arg Pro Gly Leu Arg Asn Val Glu Leu Val Glu Arg Met His Ser
325 330 335

Arg Leu Lys Ser Cys Leu Gln Thr Val Ile Ala Gln Asn Arg Ser Asp
340 345 350

Gly Pro Gly Phe Leu Arg Glu Leu Met Asp Thr Leu Pro Asp Leu Arg
355 360 365

Thr Leu Ser Thr Leu His Thr Glu Lys Leu Val Val Phe Arg Thr Glu
370 375 380

His Lys Glu Leu Leu Arg Gln Gln Met Trp Val Glu Asp Glu Gly Ala
385 390 395 400

Leu Trp Ala Asp Ser Gly Ala Asp Asp Ser Ala Arg Ser Pro Ile Gly
405 410 415

Ser Val Ser Ser Ser Glu Ser Ser Glu Thr Thr Gly Asp Cys Gly Thr
420 425 430

Pro Leu Leu Ala Ala Thr Leu Ala Gly Arg Arg Arg Leu Asp Ser Arg
435 440 445

Gly Ser Val Asp Glu Glu Ala Leu Gly Val Ala His Leu Ala His Asn
450 455 460

Gly Leu Thr Val Thr Pro Val Arg Pro Pro Pro Arg Tyr Arg Lys Leu
465 470 475 480

Asp Ser Pro Thr Asp Ser Gly Ile Glu Ser Gly Asn Glu Lys His Glu
485 490 495

Arg Ile Val Gly Pro Glu Ser Gly Cys Ser Ser Pro Arg Ser Ser Leu
500 505 510

Glu Glu His Ser Asp Asp Arg Arg Pro Ile Ala Pro Ala Asp Asp Met
515 520 525

Pro Val Leu Lys Arg Val Leu Gln Ala Pro Pro Leu Tyr Asp Ala Ser
530 535 540

Ser Leu Met Asp Glu Ala Tyr Lys Pro His Lys Lys Phe Arg Ala Met
545 550 555 560

Arg Arg Asp Thr Trp Ser Glu Ala Glu Ala Arg Pro Gly Arg Pro Thr
565 570 575






Pro Ser Pro Gln Pro Pro His His Pro His Pro Ala Ser Pro Ala His
580 585 590

Pro Ala His Ser Pro Arg Pro Ile Arg Ala Pro Leu Ser Ser Thr His
595 600 605

Ser Val Leu Ala Lys Ser Leu Met Glu Gly Pro Arg Met Thr Pro Glu
610 615 620

Gln Leu Lys Arg Thr Asp Ile Ile Gln Gln Tyr Met Arg Arg Gly Glu
625 630 635 640

Thr Gly Ala Pro Thr Glu Gly Cys Pro Leu Arg Ala Gly Gly Leu Leu
645 650 655

Thr Cys Phe Arg Gly Ala Ser Pro Ala Pro Gln Pro Val Ile Ala Leu
660 665 670

Gln Val Asp Val Ala Glu Thr Asp Ala Pro Gln Pro Leu Asn Leu Ser
675 680 685

Lys Lys Ser Pro Ser Pro Ser Pro Pro Pro Pro Pro Pro Arg Ser Tyr
690 695 700

Met Pro Pro Met Leu Pro Ala
705 710